# Recursive training example selection for entity resolution
#
# Copyright 2015 Peter Christen, Dinusha Vatsalan and Qing Wang
# Email: peter.christen@anu.edu.au
# Web:   http://users.cecs.anu.edu.au/~christen/
#
#   This program is free software: you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation, either version 3 of the License, or
#   (at your option) any later version.
#
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#   GNU General Public License for more details.
#
#   You should have received a copy of the GNU General Public License
#   along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
# -----------------------------------------------------------------------------
#
# TODO: - allow several weight vectors per corner in select_corners function
#       - what to do with clusters which are pure but too large
#         (> max_cluster_size), how to split? (they need to be split)
#
# Usage:
#
# python recursive-train-selection.py [weight_vector_file] [min_data_set_size]
#                                     [weights_to_use] [init_method]
#                                     [select_method] [sample_error]
#                                     [oracle_acc] [split_classifier]
#                                     [min_cluster_size] [max_cluster_size]
#                                     [min_purity] [budget_num_class]
#                                     [queue_order] [res_file_name]
# where:
#
# weight_vector_file  The name of the weight vector file, with an assumed
#                     format of: [rec_id1,rec_id2,true_match_status,w1,w2..]
#                     where: rec_id1/rec_id2 are the record identifiers,
#                            true_match_status is 1.0 if the weight vector
#                              was generated by a true matching record pair,
#                              0.0 otherwise,
#                            w1, w2, ..., wd are the d weights (normalised
#                              into 0..1)
#
# min_data_set_size   The number of records in the smaller of the two data
#                     sets, used to calculate the initial proportion of
#                     matches.
#
# weights_to_use      A list of the indices of the weights to use. Must be of
#                     the format [x,y,z] with x/y/z>=0 and x/y/z<=(d-1) and no
#                     spaces between numbers.
#
# init_method         How to select the initial weight vectors, can either be
#                     'far' (select weight vectors farthest away from each
#                     other), '01' (select weight vectors closest to [0,..,0]
#                     and [1,..,1]), 'corner' (select weight vectors closest
#                     to all corners of the multi-dimensional space), or 'ran'
#                     for a random selection.
#
# select_method       Which approach to use when selecting representative
#                     weight vectors in a cluster, possible are:
#                       'far'     (only use farthest away weight vectors)
#                       'far_med' (also include the medoid weight vector)
#                       'dense'   (density based)
#                       'aggl'    (agglomerative clustering based)
#                       'ran'     (random selection)
#
# sample_error        When calculating the number of samples to use for a
#                     certain cluster, the margin of error required. Must be a
#                     value above 0.0 and below 1.0 (typically 0.05 to 0.2)
#
# oracle_acc          The assumed accuracy of the oracle, a value between 0.5
#                     and 1.
# 
# split_classifier    The approach used to classify and split matches from
#                     non-matches, possible are:
#                       'knn'   (for k nearest neighbour classifier)
#                       'svm'   (for support vector machine classifier)
#                       'dtree' (for decision tree classifier)
#
# min_cluster_size    The minimum size of a cluster allowed, clusters are not
#                     split further below that size. Must be a positive number.
#
# max_cluster_size    The maximum size of a cluster allowed before it is added
#                     to the training set. Must be a positive number larger
#                     than min_cluster_size.
#
# min_purity          The minimum 'purity' (i.e. accuracy) of a cluster, once
#                     this is reached a cluster will not be split further. Must
#                     be a value between 0.5 and 1.
#
# budget_num_class    The number of manual classifications we can do by the
#                     oracle.
#
# queue_order         The method used to order the queue of clusters and
#                     decide which cluster to process next. Possible are:
#                     - 'fifo'      First in first out, our initial approach
#                                   (PAKDD'15)
#                     - 'random'    Randomly select a cluster from the queue
#                     - 'max_puri'  Cluster with highest purity first (based
#                                   on purity of parent cluster, as child
#                                   cluster purity can only be larger)
#                     - 'min_puri'  Clusters with lowest purity first (again
#                                   based on parent cluster purity)
#                     - 'min_entr'  Clusters with lowest entropy first (based 
#                                   on entropy of parent cluster, as child
#                                   cluster entropy can only be smaller)
#                     - 'max_entr'  Clusters with highest entropy first (again
#                                   based on parent cluster entropy)
#                     - 'max_size'  Largest clusters first
#                     - 'min_size'  Smallest clusters first
#                     - 'close_01'  Closest to 0 or 1 corner
#                     - 'close_mid' Closest to middle (half between 0 and 1
#                                   corners)
#                     - 'balance'   Select so that training set sizes are
#                                   balanced
#                     - 'sample'    Select the cluster which has the largest
#                                   ratio of cluster size divided by the
#                                   number of samples required
#
# res_file_name       Name of the file where results are to be appended.

# TODO: Maybe add: ****************************
# fuzzy_reg_ratio     A value between 0 and 1 giving the maximum ratio of the
#                     fuzzy region from where not to select weight vectors in
#                     the splitting phase (i.e. if distances between matches
#                     and non-matches are nearly the same; idea adapted from:
#                     http://link.springer.com/chapter/10.1007/11677437_12,
#                     Section 4.2)
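The `sample_error` margin of error above typically drives a per-cluster sample size calculation. The exact formula used later in this script is not shown in this excerpt, so the following is only a sketch of the common Cochran formula with finite-population correction (function name and defaults are illustrative, not the script's own):

```python
import math

def sample_size(population_size, margin_of_error, z=1.96, p=0.5):
    # Cochran's sample size with finite population correction; z is the
    # z-score for the desired confidence level (1.96 ~ 95%), p the assumed
    # proportion (0.5 maximises the required sample size)
    n0 = (z * z * p * (1.0 - p)) / (margin_of_error ** 2)
    n = n0 / (1.0 + (n0 - 1.0) / population_size)
    return int(math.ceil(n))

print(sample_size(1000, 0.1))  # samples needed for a cluster of 1000 vectors
```

A smaller `sample_error` demands more samples: halving the margin of error roughly quadruples `n0` before the finite-population correction.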

# -----------------------------------------------------------------------------
#
import copy
import csv
import gzip
import itertools
import math
import os
import random
import sys
import time

import numpy
import sklearn.cluster
import sklearn.svm
import sklearn.tree

random.seed(42)

# -----------------------------------------------------------------------------
# Define some constant values
 
# Which column in weight vector file contains match status
#
# TRUE_MATCH_COL = 2
TRUE_MATCH_COL = 6

# Pureness band for removing weight vectors from the input weight vector set:
# weight vectors with pureness strictly between MIN_REMOVE_PURENESS and
# (1 - MIN_REMOVE_PURENESS) are removed completely, while outside this band
# only the minority 'class' weight vectors are removed
#
MIN_REMOVE_PURENESS = 0.1

# k-nn classifier k value
#
KNN_K = 7

# Number of weight vectors to sample (per cluster) if a set is too large in the
# agglomerative selection approach
#
NUM_SAMPLE = 100

RUN_SVM = False  # If the final SVM classifiers are to be run or not

# Which distance calculation to use for cluster minimums
#
CLUSTER_MIN_DIST = 'average'  # One of: 'single', 'average', 'complete'

# -----------------------------------------------------------------------------
# Process command line arguments
#
#weight_vec_file_name = sys.argv[1]
weight_vec_file_name = 'vetorSimilaridades-14-03.csv'
# weight_vec_file_name = 'vetorSimilaridades-DEMO.csv'

#min_data_set_size = int(sys.argv[2])
min_data_set_size = 1295
assert min_data_set_size > 1

#weights_to_use = sys.argv[3]  # List of weights to use (format: [i,j,.. ,k])
# weights_to_use = '[0,1,2,3,4]'  # List of weights to use (format: [i,j,.. ,k])
weights_to_use = '[0,1,2,3,4,5,6]'  # List of weights to use (format: [i,j,.. ,k])
import ast  # literal_eval safely parses the list literal (safer than eval)
weights_to_use_list = list(ast.literal_eval(weights_to_use))
num_weights = len(weights_to_use_list)


for i in range(num_weights):  # Convert into indices in weight vector lists
  assert weights_to_use_list[i] >= 0
  #weights_to_use_list[i] += 3 # Starting from the 4th column (column 3)
  weights_to_use_list[i] += 7 # Starting from the 8th column (column 7)
    

#init_method = sys.argv[4]
init_method = 'far'
assert init_method in ['far', '01', 'corner', 'ran']

#select_method = sys.argv[5]
select_method = 'far'
assert select_method in ['far', 'far_med', 'ran', 'dense', 'aggl']

#sample_error = float(sys.argv[6])
sample_error = 0.1
assert sample_error > 0.0 and sample_error < 1.0

#oracle_acc = float(sys.argv[7])
oracle_acc = 1.0
assert oracle_acc > 0.5 and oracle_acc <= 1.0

#split_classifier = sys.argv[8]
split_classifier = 'svm'
assert split_classifier in ['knn', 'svm', 'dtree']

#min_cluster_size = int(sys.argv[9])
min_cluster_size = 20
assert min_cluster_size >= 1

#max_cluster_size = int(sys.argv[10])
max_cluster_size = 50
assert max_cluster_size > min_cluster_size

#min_purity = float(sys.argv[11])
min_purity = 0.9
assert min_purity > 0.5 and min_purity < 1.0

#budget_num_class = int(sys.argv[12])
#budget_num_class = 1000
budget_num_class = 100
assert budget_num_class >= 1

#queue_order = sys.argv[13]
queue_order = 'random'
assert queue_order in ['fifo', 'random', 'max_puri', 'min_puri', 'max_size',
                       'min_size', 'min_entr', 'max_entr', 'close_01',
                       'close_mid', 'balance', 'sample']

#res_file_name = sys.argv[14]
res_file_name = 'resultado.csv'

weight_vector_dict_orig = {}

# -----------------------------------------------------------------------------
# Function to load weight vector file
#
def load_weight_vec_file(in_file_name, weights_to_use_list):
  """Load weight vector file, return a dictionary of weight vectors (as lists)
     of the form:
                 [rec_id1,rec_id2,true_match_status,w1,w2,...,wd]
     where only the selected weights in the 'weights_to_use_list' are being
     kept.

     Returns a list of the names of these weights (from header line) as well
     as the list of weight vectors, where each weight vector is a tuple of
     weights.
  """

  print
  print 'Load weight vector file:', in_file_name

  # Load weight vector file, extract weights from selected attributes
  #
  weight_vector_dict = {}  # Keys will be record identifiers

  num_dup_rec_ids = 0  # Number of record IDs (entity ID pairs) that occur
                       # several times

  if (in_file_name.endswith('.gz')):
    in_file = gzip.open(in_file_name)
  else:
    in_file = open(in_file_name)
  header_line = in_file.readline()
  #header_list = header_line.strip().split(',') # Strip whitespace and split on commas
  header_list = header_line.strip().split(';') # Strip whitespace and split on semicolons

  weights_name_list = []
  for w in weights_to_use_list:
    weights_name_list.append(header_list[w])
  print '  Weights to use:', weights_name_list

  for line in in_file: # For each line in the file
    line_list = line.strip().split(';') # List of the fields in this line
    #print line_list
    #rec_id =    line_list[0].strip()+'-'+line_list[1].strip() # Build the pair identifier
    rec_id =    line_list[0].strip()+'~'+line_list[1].strip() # Build the pair identifier

    # Check if a certain entity-ID pair occurs more than once
    #
    if (rec_id in weight_vector_dict):
      num_dup_rec_ids += 1
    assert line_list[TRUE_MATCH_COL] in ['0.0', '1.0']

    if (line_list[TRUE_MATCH_COL] == '1.0'):
      match_status = True
    else:
      match_status = False

    this_weight_vec = []
    for w in weights_to_use_list:
      this_weight_vec.append(float(line_list[w].strip()))

    weight_vector_dict[rec_id] = (match_status, tuple(this_weight_vec))

  print '  Number of weight vectors:', len(weight_vector_dict)
  print '    Number of entity ID pairs that occurred more than once:', \
        num_dup_rec_ids 
  print
  
  # Added for debugging: find record pairs whose weight vectors contain two
  # specific similarity values
  #
  sub = 0.1889763779527559
  sub2 = 0.5487804878048781

  for k,v in weight_vector_dict.iteritems():  # v[0] is the match status;
                                              # v[1] is the weight vector tuple
    if (sub in v[1]) and (sub2 in v[1]):
      print k
  # This printed:
  #   REC_ID-422-REC_ID-230
  #   REC_ID-425-REC_ID-217
  #   REC_ID-425-REC_ID-215
  #   REC_ID-423-REC_ID-230
  #   REC_ID-424-REC_ID-230

  #print weight_vector_dict['REC_ID-1~REC_ID-0']


  return weights_name_list, weight_vector_dict

# -----------------------------------------------------------------------------
# Function to analyse the quality of a dictionary of weight vectors, and then
# remove weight vectors of low quality (i.e. those that correspond to both
# matches and non-matches).
#
def analyse_filter_weight_vectors(weight_vector_dict):
  """Find number of unique weight vectors, their frequencies, and their
     pureness (i.e. how many are matches/non-matches).

     Returns the modified weight vector dictionary with non-pure weight vectors
     removed, a dictionary with all unique weight vectors, as well as the
     number of true matches and non-matches in the original weight vector
     dictionary.
  """

  num_weight_vectors = len(weight_vector_dict)

  print 'Analyse set of %d weight vectors' % (num_weight_vectors)

  unique_weight_vec_dict = {}  # Keys are weight vectors, values are counts of
                               # how many of these are matches and non-matches
  num_true_matches =     0  # Count the number of true matches and non-matches
  num_true_non_matches = 0

  for rec_id in weight_vector_dict:
    (match_status, weight_vector_tuple) = weight_vector_dict[rec_id]

    match_count_list = unique_weight_vec_dict.get(weight_vector_tuple, [0,0])
    if (match_status == True):
      num_true_matches += 1
      match_count_list[0] += 1
    else:
      num_true_non_matches += 1
      match_count_list[1] += 1

    unique_weight_vec_dict[weight_vector_tuple] = match_count_list

  num_unique_weight_vectors = len(unique_weight_vec_dict)

  print '  Containing %d true matches and %d true non-matches' % \
        (num_true_matches, num_true_non_matches)
  print '    (%.2f%% true matches)' % \
        (100.0*num_true_matches / len(weight_vector_dict))

  print '  Identified %d unique weight vectors' % (num_unique_weight_vectors)

  count_dict = {}  # Counts of how often weight vectors occur
  for match_count_list in unique_weight_vec_dict.itervalues():
    weight_vec_count = sum(match_count_list)
    sum_count = count_dict.get(weight_vec_count, 0) + 1
    count_dict[weight_vec_count] = sum_count

  count_list = count_dict.items()
  count_list.sort()

  print '  Frequency distribution of occurrences of weight vectors:'
  print '    Occurrence : Number of weight vectors that occur that often'
  for (freq, count) in count_list:
    print '      %5d : %5d  (%.2f%%)' % \
          (freq, count, 100.0*count/num_unique_weight_vectors)
  print

  # Identify all non-pure weight vectors and remove from original dictionary of
  # weight vectors
  #
  non_pure_weight_vec_dict = {}  # Keys will be weight vector tuples, values
                                 # their pureness
  pureness_dict = {}  # Also collect statistics

  for (weight_vector_tuple, match_count_list) in \
                                           unique_weight_vec_dict.iteritems():
    pureness = float(match_count_list[0]) / sum(match_count_list)
    pureness_count = pureness_dict.get(pureness, 0) + 1
    pureness_dict[pureness] = pureness_count

    if (pureness not in [0.0, 1.0]):
      non_pure_weight_vec_dict[weight_vector_tuple] = pureness

  pureness_list = pureness_dict.items()
  pureness_list.sort(reverse=True)

  print 'Identified %d non-pure unique weight vectors' % \
        (len(non_pure_weight_vec_dict)), '(from %d unique weight vectors)' % \
        (len(unique_weight_vec_dict))

  print 'Pureness (as percentage of matches) for a certain unique weight ' + \
        'vector:'
  print '  Pureness : Count'
  for (pureness, count) in pureness_list:
    print '     %5.3f : %2d' % (pureness, count),
    if (pureness not in [0.0, 1.0]):
      if ((pureness < MIN_REMOVE_PURENESS) or \
          (pureness > (1.0- MIN_REMOVE_PURENESS))):
        print '  (minority class weight vectors with this pureness to be ' + \
              'removed)'
      else:
        print '  (all weight vectors with this pureness to be removed)'
    else:
      print
  print
  
  # Remove non-pure weight vectors from the original dictionary of weight
  # vectors (if the pureness is worse than MIN_REMOVE_PURENESS then remove
  # every such weight vector, otherwise only remove the minority 'class'
  # weight vectors)
  #
  for rec_id in weight_vector_dict.keys():
    match_status, weight_vector_tuple = weight_vector_dict[rec_id]

    if weight_vector_tuple in non_pure_weight_vec_dict:
      weight_vector_pureness = non_pure_weight_vec_dict[weight_vector_tuple]

      # Check if the weight vector has to be removed in any case
      #
      if ((weight_vector_pureness > MIN_REMOVE_PURENESS) and \
          (weight_vector_pureness < (1.0-MIN_REMOVE_PURENESS))):
        del weight_vector_dict[rec_id]

      else:  # Only remove if the weight vector is in the minority 'class'

        # Mostly non-matches, so only remove weight vectors which are true
        # matches
        #
        if (weight_vector_pureness <= MIN_REMOVE_PURENESS):
          if (match_status == True):
            del weight_vector_dict[rec_id]

        else:  # Mostly matches, so only remove true non-matches
          if (match_status == False):
            del weight_vector_dict[rec_id]

  print 'Removed %d non-pure weight vectors' % \
        (num_weight_vectors - len(weight_vector_dict))
  print

  # Generate a weighted dictionary of unique weight vectors, their counts, and
  # their match status
  #
  weighted_unique_weight_vec_dict = {}  # Keys are weight vectors, values are
                                        # counts of how often they occur and
                                        # their match status

  for rec_id in weight_vector_dict:
    (match_status, weight_vector_tuple) = weight_vector_dict[rec_id]

    if (weight_vector_tuple not in weighted_unique_weight_vec_dict):
      weight_vector_count_match_list = [1, match_status]
    else:
      weight_vector_count_match_list = \
                          weighted_unique_weight_vec_dict[weight_vector_tuple]
      weight_vector_count_match_list[0] += 1
      assert weight_vector_count_match_list[1] == match_status
    weighted_unique_weight_vec_dict[weight_vector_tuple] = \
                                                weight_vector_count_match_list

  print 'Final number of weight vectors to use:', len(weight_vector_dict)
  print '  Number of unique weight vectors:', \
        len(weighted_unique_weight_vec_dict)
  print

  return weight_vector_dict, weighted_unique_weight_vec_dict, \
         num_true_matches, num_true_non_matches
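As a concrete illustration of the removal rule implemented above (with MIN_REMOVE_PURENESS = 0.1), this hypothetical helper maps a pureness value to the action the function takes:

```python
MIN_REMOVE_PURENESS = 0.1

def removal_action(pureness):
    # pureness = fraction of matches among all pairs sharing a weight vector
    if pureness in (0.0, 1.0):
        return 'keep all'        # pure weight vector, nothing removed
    if MIN_REMOVE_PURENESS < pureness < 1.0 - MIN_REMOVE_PURENESS:
        return 'remove all'      # too ambiguous to keep any copy
    if pureness <= MIN_REMOVE_PURENESS:
        return 'remove matches'  # mostly non-matches: drop minority matches
    return 'remove non-matches'  # mostly matches: drop minority non-matches

print(removal_action(0.5), removal_action(0.05), removal_action(1.0))
```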

# -----------------------------------------------------------------------------
# Function to calculate Euclidean distance between two vectors
#
def euclidean_dist(vec1, vec2, vec_len):

  dist = 0.0

  for i in range(vec_len):
    x = vec1[i] - vec2[i]
    dist += x*x

  return math.sqrt(dist)
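Since numpy is already imported by this script, the loop above can be cross-checked against numpy.linalg.norm; a small equivalence sketch (toy vectors, not data from the script):

```python
import math
import numpy

def euclidean_dist(vec1, vec2, vec_len):
    # Same accumulation loop as in the script above
    dist = 0.0
    for i in range(vec_len):
        x = vec1[i] - vec2[i]
        dist += x * x
    return math.sqrt(dist)

v1, v2 = (0.2, 0.5, 0.9), (0.1, 0.4, 0.7)
loop_d = euclidean_dist(v1, v2, 3)
np_d = numpy.linalg.norm(numpy.asarray(v1) - numpy.asarray(v2))
assert abs(loop_d - np_d) < 1e-12  # both give sqrt(0.01 + 0.01 + 0.04)
```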

# -----------------------------------------------------------------------------
# Function to convert a weight vector into a string with a specific number of
# digits (with 3 as default)
#
def weight_vector_to_str(weight_vector, num_digit=3):
  weight_vector_str = '['
  for w in weight_vector:
    digit_str = str(round(w, num_digit))
    if (len(digit_str.split('.')[-1]) < num_digit):
      num_miss_zero = num_digit - len(digit_str.split('.')[-1])
      digit_str += '0'*num_miss_zero

    weight_vector_str = weight_vector_str+digit_str+', '
  weight_vector_str = weight_vector_str[:-2]+']'

  return weight_vector_str
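The manual rounding and zero padding above can also be expressed with fixed-point string formatting; a minimal sketch of an equivalent helper (hypothetical name; note round() and '%.nf' can differ on rare halfway cases):

```python
def weight_vector_to_str_fmt(weight_vector, num_digit=3):
    # Fixed-point formatting pads trailing zeros automatically
    fmt = '%%.%df' % num_digit
    return '[' + ', '.join(fmt % w for w in weight_vector) + ']'

print(weight_vector_to_str_fmt((0.5, 0.25, 1.0)))  # [0.500, 0.250, 1.000]
```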

# -----------------------------------------------------------------------------
# Function to select weight vectors closest to [0,..,0] and [1,..,1] corners
#
def select_01(weight_vec_dict, k, num_weights):
  """Return k selected weight vectors from the given weight vector dictionary
     that are closest to the [0,..,0] and [1,..,1] corners, with k/2 selected
     in each corner (if k is odd one more weight vector is selected in the
     [1,..,1] (match) corner).

     Returns a new dictionary that contains k weight vectors.
  """

  print 'Initial selection of weight vectors closest to [0,..,0] and [1,..,1]'

  zero_vector = [0.0]*num_weights
  one_vector =  [1.0]*num_weights

  # Two lists with tuples (distance to 0/1 and record identifier)
  #
  zero_close_list = []  # List of weight vectors closest to [0,..,0]
  one_close_list =  []  # List of weight vectors closest to [1,..,1]

  # Calculate length of the two lists
  #
  zero_list_len = k/2
  if (k % 2 == 0):
    one_list_len =  k/2
  else:
    one_list_len =  k/2+1

  print '  Select %d weight vectors closest to zero and %d closest to one.' % \
        (zero_list_len, one_list_len)

  max_0_dist = -1.0  # Current maximum distance to 0
  max_1_dist = -1.0  # Current maximum distance to 1

  for weight_vector_tuple in weight_vec_dict:
    dist_to_0 = euclidean_dist(weight_vector_tuple, zero_vector, num_weights)
    dist_to_1 = euclidean_dist(weight_vector_tuple, one_vector, num_weights)

    #zero_close_list.append([dist_to_0, weight_vector_tuple])
    #one_close_list.append([dist_to_1, weight_vector_tuple])

    if (len(zero_close_list) < zero_list_len):  # List can grow
      zero_close_list.append([dist_to_0, weight_vector_tuple])
      if (dist_to_0 > max_0_dist):
        max_0_dist = dist_to_0
      zero_close_list.sort()
    elif (dist_to_0 < max_0_dist):  # We have to replace a list element
      zero_close_list = zero_close_list[:-1]  # Remove furthest away w. vector
      zero_close_list.append([dist_to_0, weight_vector_tuple])
      zero_close_list.sort()
      max_0_dist = zero_close_list[-1][0]

    if (len(one_close_list) < one_list_len):  # List can grow
      one_close_list.append([dist_to_1, weight_vector_tuple])
      if (dist_to_1 > max_1_dist):
        max_1_dist = dist_to_1
      one_close_list.sort()
    elif (dist_to_1 < max_1_dist):  # We have to replace a list element
      one_close_list = one_close_list[:-1]  # Remove furthest away w. vector
      one_close_list.append([dist_to_1, weight_vector_tuple])
      one_close_list.sort()
      max_1_dist = one_close_list[-1][0]

  print '  Closest to zero:'
  for (dist, weight_vector_tuple) in zero_close_list:
    print '    With distance to zero of %.4f: %s (%s)' % \
          (dist, weight_vector_to_str(weight_vector_tuple), \
           weight_vec_dict[weight_vector_tuple][1])
  print '  Closest to one:'
  for (dist, weight_vector_tuple) in one_close_list:
    print '    With distance to one of %.4f:  %s (%s)' % \
          (dist, weight_vector_to_str(weight_vector_tuple), \
           weight_vec_dict[weight_vector_tuple][1])
  print

  # Build a dictionary of selected weight vectors
  #
  selected_weight_vec_dict = {}
  for (dist, weight_vector_tuple) in zero_close_list+one_close_list:
    weight_vector_count_match_list = weight_vec_dict[weight_vector_tuple]

    assert weight_vector_tuple not in selected_weight_vec_dict
    selected_weight_vec_dict[weight_vector_tuple] = \
                                                weight_vector_count_match_list

  return selected_weight_vec_dict
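The incremental list maintenance in select_01 (grow to the target length, then replace the current farthest element) is a k-smallest selection; for illustration, heapq.nsmallest yields the same result in one call (toy data, illustrative helper name):

```python
import heapq
import math

def k_closest(vectors, corner, k):
    # Keep the k vectors with the smallest Euclidean distance to 'corner'
    dist = lambda v: math.sqrt(sum((a - b) ** 2 for a, b in zip(v, corner)))
    return heapq.nsmallest(k, vectors, key=dist)

vecs = [(0.1, 0.1), (0.9, 0.8), (0.2, 0.3), (0.7, 0.9)]
print(k_closest(vecs, (0.0, 0.0), 2))  # the two vectors nearest [0,0]
```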

# -----------------------------------------------------------------------------
# Function to select weight vectors closest to each of the 2^num_weights
# corners (one weight vector per corner)
#
def select_corners(weight_vec_dict, num_weights):
  """Return selected weight vectors from the given weight vector dictionary
     that are closest to each of the 2^num_weights corners, i.e. one weight
     vector each from [0,..,0], [0,..,1], ... [1,..,1].

     Returns a new dictionary that contains 2^num_weights weight vectors.
  """

  print 'Initial selection of weight vectors closest to corners (one per ' + \
        'corner)'

  # Build a dictionary of selected weight vectors
  #
  selected_weight_vec_dict = {}

  # Get all permutations of 0 / 1 weights
  #
  all_corners_weight_vector_list = [seq for seq in itertools.product([0.0,1.0],
                                    repeat=num_weights)]
  assert len(all_corners_weight_vector_list) == 2**num_weights, \
         (len(all_corners_weight_vector_list), 2**num_weights)

  for corner_weight_vector in all_corners_weight_vector_list:
    min_corner_dist = 99999.0

    for weight_vector_tuple in weight_vec_dict:

      # Only select a weight vector once
      #
      if (weight_vector_tuple not in selected_weight_vec_dict):
        corner_dist = euclidean_dist(weight_vector_tuple, corner_weight_vector,
                                     num_weights)
        if (corner_dist < min_corner_dist):
          min_corner_dist = corner_dist
          min_corner_weight_vector = weight_vector_tuple

    corner_weight_vector_count_match_list = \
              weight_vec_dict[min_corner_weight_vector]
    assert min_corner_weight_vector not in selected_weight_vec_dict
    selected_weight_vec_dict[min_corner_weight_vector] = \
                                          corner_weight_vector_count_match_list
    print '  Corner %s has closest weight vector with distance %.4f: %s (%s)' \
          % (weight_vector_to_str(corner_weight_vector), min_corner_dist,
             weight_vector_to_str(min_corner_weight_vector),
             weight_vec_dict[min_corner_weight_vector][1]) 
  print

  assert len(selected_weight_vec_dict) == 2**num_weights

  return selected_weight_vec_dict

# -----------------------------------------------------------------------------
# Function to select k weight vectors using the farthest first method
#
def select_farthest(weight_vec_dict, k, num_weights):
  """Return k selected weight vectors from the given weight vector dictionary
     based on farthest first approach.

     Returns a new dictionary that contains k weight vectors.
  """

  print 'Farthest first selection of %d weight vectors from %d vectors' % \
        (k, len(weight_vec_dict))

  if (k == len(weight_vec_dict)):  # Return all weight vectors
    selected_weight_vector_dict = weight_vec_dict.copy()
    return selected_weight_vector_dict

  # Keep a dictionary of the so far selected weight vectors
  #
  selected_weight_vector_dict = {}

  # Select an arbitrary first weight vector (it is removed again at the end,
  # so this choice does not bias the final selection)
  #
  first_weight_vector = weight_vec_dict.iterkeys().next()

  selected_weight_vector_dict[first_weight_vector] = \
                                          weight_vec_dict[first_weight_vector]

  # Loop until we have selected k+1 weight vectors (then remove the first one)
  #
  while (len(selected_weight_vector_dict) <= k):

    loop_max_dist = -1.0    # The maximum distance of any unselected weight
                            # vector to a selected weight vector
    loop_max_weight_vec = None  # The weight vector tuple with this maximum
                                # distance

    # Find the weight vector farthest away from all so far selected weight
    # vectors
    #
    for this_weight_vector_tuple in weight_vec_dict:

      # Only consider those not selected so far
      #
      if this_weight_vector_tuple not in selected_weight_vector_dict:

        # Calculate minimum distance of the current weight vector to any so far
        # selected weight vectors
        #
        this_min_dist = 999.0

        for sel_weight_vector_tuple in selected_weight_vector_dict:
          assert sel_weight_vector_tuple != this_weight_vector_tuple

          dist = euclidean_dist(this_weight_vector_tuple, \
                                sel_weight_vector_tuple, num_weights)
          if (dist < this_min_dist):
            this_min_dist = dist

        # Check if this minimum distance is the largest distance in loop so far
        #
        if (this_min_dist > loop_max_dist):
          loop_max_dist =       this_min_dist
          loop_max_weight_vec = this_weight_vector_tuple

    # Add this new farthest weight vector to the selected weight vectors
    #
    selected_weight_vector_dict[loop_max_weight_vec] = \
                                           weight_vec_dict[loop_max_weight_vec]

  del selected_weight_vector_dict[first_weight_vector]

  assert len(selected_weight_vector_dict) == k

  print '  The selected farthest weight vectors are:'
  for sel_weight_vector in selected_weight_vector_dict:
    print '    %s (%s)' % (weight_vector_to_str(sel_weight_vector),
          selected_weight_vector_dict[sel_weight_vector][1])
  print

  return selected_weight_vector_dict
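A compact standalone sketch of the same greedy farthest-first traversal on toy 2D points (unlike select_farthest above, this sketch keeps the seed point instead of selecting k+1 vectors and dropping the seed):

```python
import math

def farthest_first(points, k):
    # Greedy farthest-first traversal: start from the first point, then
    # repeatedly add the point whose minimum distance to the already
    # selected set is largest
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [points[0]]
    while len(selected) < k:
        best = max((p for p in points if p not in selected),
                   key=lambda p: min(dist(p, s) for s in selected))
        selected.append(best)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
print(farthest_first(pts, 3))  # well-spread corners, never (0.1, 0.0)
```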

# -----------------------------------------------------------------------------
# Function to select the k weight vectors that are closest on average to all
# other weight vectors (i.e. the densest ones)
#
def select_densest(weight_vec_dict, k, num_weights):
  """Return k selected weight vectors from the given weight vector dictionary
     using a density based approach.

     Returns a new dictionary that contains k weight vectors.
  """

# TODO: Too slow - n^2 approach - how to make it faster? numpy?
#       Density based clustering?

  print 'Density-based selection of %d weight vectors from %d vectors' % \
        (k, len(weight_vec_dict))

  # Keep a list with average distance to all other weight vectors for each
  # weight vector
  #
  dist_list = []  # To have tuples (average distance, weight_vector)

  weight_vector_list = weight_vec_dict.keys()

  for this_weight_vector_tuple_1 in weight_vector_list:

    dist_sum = 0.0
    for this_weight_vector_tuple_2 in weight_vector_list:

      if (this_weight_vector_tuple_1 != this_weight_vector_tuple_2):
        dist = euclidean_dist(this_weight_vector_tuple_1, \
                              this_weight_vector_tuple_2, num_weights)
        dist_sum += dist
    # No need to take the average as all sums are over the same number of
    # vectors
    #
    dist_list.append((dist_sum, this_weight_vector_tuple_1))

  dist_list.sort()  # Smallest first

  # A dictionary of the so far selected weight vectors
  #
  selected_weight_vector_dict = {}

  # k first elements with smallest distance sums
  #
  for (dist_sum, weight_vector_tuple) in dist_list[:k]:
    selected_weight_vector_dict[weight_vector_tuple] = \
                                         weight_vec_dict[weight_vector_tuple]

  print '  The selected "densest" weight vectors are:'
  for sel_weight_vector in selected_weight_vector_dict:
    print '    %s (%s)' % (weight_vector_to_str(sel_weight_vector),
          selected_weight_vector_dict[sel_weight_vector][1])
  print

  return selected_weight_vector_dict
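The TODO comment above asks how to speed up the n^2 distance loop with numpy; one option is broadcasting, sketched here with an illustrative helper that computes the same per-vector distance sums:

```python
import numpy

def distance_sums(weight_vectors):
    # Pairwise Euclidean distances via broadcasting: 'diff' has shape (n, n, d)
    a = numpy.asarray(weight_vectors, dtype=float)
    diff = a[:, numpy.newaxis, :] - a[numpy.newaxis, :, :]
    dists = numpy.sqrt((diff ** 2).sum(axis=2))
    return dists.sum(axis=1)  # row sums = total distance to all other vectors

vecs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(distance_sums(vecs))  # the densest vector has the smallest sum
```

The k densest vectors are then the k smallest entries, e.g. numpy.argsort(distance_sums(vecs))[:k].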

# -----------------------------------------------------------------------------
# Function to select k weight vectors based on agglomerative clustering (get
# k clusters, then select most central weight vector in each cluster)
#
def select_agglomerative(weight_vec_dict, k, num_weights):
  """Return k selected weight vectors from the given weight vector dictionary
     using an agglomerative clustering approach.

     If the weight vector dictionary is too large then sampling of weight
     vectors is applied.

     Returns a new dictionary that contains k weight vectors.
  """

  num_weight_vec = len(weight_vec_dict)

  print 'Agglomerative clustering based selection of' + \
        ' %d weight vectors from %d vectors' % (k, num_weight_vec)

  weight_vector_tuple_list = weight_vec_dict.keys()

  if (num_weight_vec > k*NUM_SAMPLE):
    num_weight_vec = k*NUM_SAMPLE
    print '  Randomly select %d weight vectors for clustering' % \
          (num_weight_vec)
    random.shuffle(weight_vector_tuple_list)
    weight_vector_tuple_list = weight_vector_tuple_list[:num_weight_vec]

  # Prepare the data for clustering
  #
  cluster_data = numpy.zeros([num_weight_vec, num_weights])

  i = 0
  for weight_vector_tuple in weight_vector_tuple_list:
    cluster_data[i, :] = weight_vector_tuple
    i += 1

  aggl_cluster =   sklearn.cluster.Ward(n_clusters=k, compute_full_tree=False)
  cluster_labels = aggl_cluster.fit_predict(cluster_data)

  assert max(cluster_labels) == (k-1), (max(cluster_labels), k)

  cluster_centroid_list = []

  for i in range(k):
    cluster_centroid_list.append(numpy.zeros(num_weights))

  cluster_size_list = [0]*k

  for i in range(num_weight_vec):
    cluster_num = cluster_labels[i]
    cluster_size_list[cluster_num] += 1
#    this_weight_vector = cluster_data[:][i]

    for j in range(num_weights):
      cluster_centroid_list[cluster_num][j] += cluster_data[i][j]

  assert sum(cluster_size_list) == num_weight_vec
  print '  Cluster sizes:', cluster_size_list

  # Normalise cluster centroids
  #
  for i in range(k):
    for j in range(num_weights):
      cluster_centroid_list[i][j] /= cluster_size_list[i]

  for i in range(k):  # Check all cluster centroids contain normalised values
    assert max(cluster_centroid_list[i]) <= 1.0
    assert min(cluster_centroid_list[i]) >= 0.0

  # Find weight vector closest to each cluster centroid
  #
  centroid_closest_weight_vector_dict = {}  # One per cluster

  for weight_vector_tuple in weight_vec_dict:  # Loop over all weight vectors

    for i in range(k):
      cluster_centroid = cluster_centroid_list[i]
      cluster_min_dist, closest_weight_vector = \
                       centroid_closest_weight_vector_dict.get(i, (99999.9, None))
      this_dist = euclidean_dist(weight_vector_tuple, cluster_centroid, \
                                 num_weights)
      if (this_dist < cluster_min_dist):
        centroid_closest_weight_vector_dict[i] = (this_dist, \
                                                  weight_vector_tuple)

  # A dictionary of the selected weight vectors
  #
  selected_weight_vector_dict = {}

  for (cluster_min_dist, closest_weight_vector) in \
                             centroid_closest_weight_vector_dict.itervalues():
    selected_weight_vector_dict[closest_weight_vector] = \
                                         weight_vec_dict[closest_weight_vector]

  print '  The selected weight vectors are:'
  for sel_weight_vector in selected_weight_vector_dict:
    print '    %s (%s)' % (weight_vector_to_str(sel_weight_vector),
          selected_weight_vector_dict[sel_weight_vector][1])
  print

  return selected_weight_vector_dict
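
# The per-cluster centroid accumulation and normalisation loops above can be
# condensed into a boolean-mask mean per cluster. A minimal numpy sketch (a
# hypothetical helper, not called by select_agglomerative):

```python
import numpy

def cluster_centroids(cluster_data, cluster_labels, k):
  """Return a (k, num_weights) array with the centroid of each cluster."""
  labels = numpy.asarray(cluster_labels)
  centroids = numpy.zeros((k, cluster_data.shape[1]))
  for c in range(k):
    members = cluster_data[labels == c]  # All rows assigned to cluster c
    assert len(members) > 0, 'Empty cluster %d' % c
    centroids[c] = members.mean(axis=0)
  return centroids
```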

# -----------------------------------------------------------------------------
# Function to select k weight vectors randomly
#
def select_random(weight_vec_dict, k):
  """Return k randomly selected weight vectors from the given weight vector
     dictionary.

     Returns a new dictionary that contains k weight vectors.
  """

  print 'Random selection of %d weight vectors from %d vectors' % \
        (k, len(weight_vec_dict))

  sel_weight_vec_list = random.sample(weight_vec_dict.keys(), k)

  # A dictionary of the so far selected weight vectors
  #
  selected_weight_vector_dict = {}

  for weight_vec_tuple in sel_weight_vec_list:
    selected_weight_vector_dict[weight_vec_tuple] = \
                                             weight_vec_dict[weight_vec_tuple]

  assert len(selected_weight_vector_dict) == k

  print '  The randomly selected weight vectors are:'
  for sel_weight_vector in selected_weight_vector_dict:
    print '    %s (%s)' % (weight_vector_to_str(sel_weight_vector),
          selected_weight_vector_dict[sel_weight_vector][1])
  print

  return selected_weight_vector_dict

# -----------------------------------------------------------------------------
# Function to find and return the most central weight vectors
#
def select_medoid(weight_vec_dict, k, num_weights):
  """Return the k most central weight vectors as a dictionary, calculated as
     those closest to the centroid.
  """

  print 'Select %d medoids from %d weight vectors' % (k, len(weight_vec_dict))

  if (k == len(weight_vec_dict)):  # Return all weight vectors
    selected_weight_vector_dict = weight_vec_dict.copy()
    return selected_weight_vector_dict

  num_weight_vec = len(weight_vec_dict)

  # First calculate the centroid of the given set of weight vectors
  #
  centroid = [0.0]*num_weights

  for weight_vector_tuple in weight_vec_dict:
    for w in range(num_weights):
      centroid[w] += weight_vector_tuple[w]

  for w in range(num_weights):  # Calculate averages
    centroid[w] /= num_weight_vec

  print '  Centroid weight vector:', weight_vector_to_str(centroid)

  medoid_list = []
  max_medoid_dist = -1.0  # Maximum distance of any medoid to the centroid

  for weight_vector_tuple in weight_vec_dict:
    centroid_dist = euclidean_dist(centroid, weight_vector_tuple, num_weights)

    if (len(medoid_list) < k):  # List can grow
      medoid_list.append([centroid_dist, weight_vector_tuple])
      if (centroid_dist > max_medoid_dist):
        max_medoid_dist = centroid_dist
      medoid_list.sort()
    elif (centroid_dist < max_medoid_dist): # We have to replace a list element
      medoid_list = medoid_list[:-1]  # Remove furthest away weight vector
      medoid_list.append([centroid_dist, weight_vector_tuple])
      medoid_list.sort()
      max_medoid_dist = medoid_list[-1][0]

  medoid_dict = {}

  print '  %d closest weight vectors:' % (k)
  for (dist, weight_vector_tuple) in medoid_list:
    medoid_dict[weight_vector_tuple] = weight_vec_dict[weight_vector_tuple]
    print '    With distance of %.4f: %s (%s)' % \
          (dist, weight_vector_to_str(weight_vector_tuple), \
           weight_vec_dict[weight_vector_tuple][1])
  print

  assert len(medoid_dict) == k

  return medoid_dict
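
# The bounded sorted list that select_medoid maintains is equivalent to a
# single heapq.nsmallest call. A minimal sketch (a hypothetical helper):

```python
import heapq
import math

def k_closest_to_centroid(weight_vector_list, centroid, k):
  """Return the k weight vectors closest (Euclidean) to the centroid as
     (distance, vector) pairs, sorted by increasing distance."""
  def dist(vec):
    return math.sqrt(sum((v - c)**2 for (v, c) in zip(vec, centroid)))
  return heapq.nsmallest(k, ((dist(v), v) for v in weight_vector_list))
```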

# -----------------------------------------------------------------------------
# Function to perform the oracle
# 
def oracle(weight_vec_dict, acc):
  """Assume a classification (manually) of the given accuracy on the given
     weight vectors.

     The function returns two dictionaries, one of the classified matches and
     one of the classified non matches, the `purity' of classification as the
     maximum of:
      - percentage of weight vectors that are classified as matches,
      - percentage of weight vectors that are classified as non-matches,
     and the entropy of the classification calculated as:

     entropy = - {(M/(M+NM)) * log2(M/(M+NM)) + (NM/(M+NM)) * log2(NM/(M+NM))}

     where M is the number of matches and NM the number of non-matches.

     Purity is a value between 0.5 and 1.0.

     A fraction 'acc' of these will be correct (i.e. classified according to
     their true match status), while a fraction 1.0 - 'acc' will be wrong.
  """

  print 'Perform oracle with %.2f%% accuracy on %d weight vectors' % \
        (100.0*acc, len(weight_vec_dict))

  match_dict =     {}
  non_match_dict = {}

  num_weight_vec = len(weight_vec_dict)

  num_correct = int(round(acc*num_weight_vec))
  print '  The oracle will correctly classify %d weight vectors and ' % \
        (num_correct) + 'wrongly classify %d' % (num_weight_vec-num_correct)

  weight_vector_list = weight_vec_dict.keys()
  random.shuffle(weight_vector_list)

  # Make sure the weight vectors are unique
  #
  assert len(set(weight_vector_list)) == len(weight_vector_list)

  # Get the list of weight vectors to be classified correctly and wrongly
  #
  corr_list = random.sample(weight_vector_list, num_correct)
  wrong_list = list(set(weight_vector_list) - set(corr_list))
  assert len(set(corr_list).intersection(set(wrong_list))) == 0

  num_tp = 0
  num_fp = 0
  num_tn = 0
  num_fn = 0

  # Perform the oracle classification
  #
  for weight_vector_tuple in corr_list:  # The correct classification
    weight_vector_count_match_list = weight_vec_dict[weight_vector_tuple]
    if (weight_vector_count_match_list[1] == True):  # A true match
      match_dict[weight_vector_tuple] = weight_vector_count_match_list
      num_tp += 1
    else:
      non_match_dict[weight_vector_tuple] = weight_vector_count_match_list
      num_tn += 1

  for weight_vector_tuple in wrong_list:  # The wrong classification
    weight_vector_count_match_list = weight_vec_dict[weight_vector_tuple]
    if (weight_vector_count_match_list[1] == True):  # A true match
      non_match_dict[weight_vector_tuple] = weight_vector_count_match_list
      num_fn += 1
    else:
      match_dict[weight_vector_tuple] = weight_vector_count_match_list
      num_fp += 1

  m =  float(len(match_dict))       # Number of matches
  nm = float(len(non_match_dict))   # Number of non-matches
  a =  float(len(weight_vec_dict))  # Number of all

  purity = max(m/a, nm/a)
  assert purity >= 0.5 and purity <= 1.0, purity

  if (m != 0.0) and (nm != 0.0):
    entropy = - m/a * math.log(m/a, 2) - nm/a * math.log(nm/a, 2)
  else:
    entropy = 0.0

  assert entropy >= 0.0 and entropy <= 1.0, entropy

  print '  Classified %d matches and %d non-matches' % \
        (len(match_dict), len(non_match_dict))
  print '    Purity of oracle classification:  %.3f' % (purity)
  print '    Entropy of oracle classification: %.3f' % (entropy)
  print '    Number of true matches:     ', num_tp
  print '    Number of false matches:    ', num_fp
  print '    Number of true non-matches: ', num_tn
  print '    Number of false non-matches:', num_fn
  print
  assert num_tp+num_fp == len(match_dict)
  assert num_tn+num_fn == len(non_match_dict)

  if (len(match_dict) == 0):
    print '*** Warning: Oracle returns an empty match dictionary ***'
  if (len(non_match_dict) == 0):
    print '*** Warning: Oracle returns an empty non-match dictionary ***'

  return match_dict, non_match_dict, purity, entropy
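
# The purity and entropy that oracle() reports depend only on the two class
# counts. A standalone sketch of the same formulas (a hypothetical helper):

```python
import math

def purity_entropy(num_matches, num_non_matches):
  """Return (purity, entropy) of a binary split, as defined in oracle():
     purity = max(M, NM) / (M + NM), entropy = two-class Shannon entropy."""
  a = float(num_matches + num_non_matches)
  purity = max(num_matches, num_non_matches) / a
  if (num_matches == 0) or (num_non_matches == 0):
    return purity, 0.0  # A pure cluster has zero entropy
  m, nm = num_matches / a, num_non_matches / a
  return purity, -m*math.log(m, 2) - nm*math.log(nm, 2)
```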

# -----------------------------------------------------------------------------
# Function to perform k nearest neighbour classification and split the cluster
#
def knn_split_classifier(weight_vector_dict, k, match_dict, non_match_dict):
  """Split the given weight vector dictionary into matches and non-matches
     according to the match and non-match training dictionaries given using
     a k nearest neighbour classifier.

     Returns two new weight vector dictionaries, one for the classified matches
     and the other for the classified non-matches.

     Note: 'num_weights' is taken from the enclosing (global) scope.
  """

  assert k%2 == 1, k  # k must be odd

  print '%d-NN classification of %d weight vectors' % \
        (k, len(weight_vector_dict))
  print '  Based on %d matches and %d non-matches' % \
        (len(match_dict), len(non_match_dict))

  # The two dictionaries of classified weight vectors to be generated
  #
  match_weight_vector_dict =     {}
  non_match_weight_vector_dict = {}

  for (weight_vector_tuple, weight_vector_count_match_list) in \
                                              weight_vector_dict.iteritems():

    this_dist_list = []  # List of this weight vector's distances to all
                         # training vectors

    for train_weight_vector in match_dict.iterkeys():
      dist = euclidean_dist(weight_vector_tuple, train_weight_vector,
                            num_weights)
      this_dist_list.append((dist, 'm'))  # Distance to a match

    for train_weight_vector in non_match_dict.iterkeys():
      dist = euclidean_dist(weight_vector_tuple, train_weight_vector,
                            num_weights)
      this_dist_list.append((dist, 'nm'))  # Distance to a non-match

    this_dist_list.sort()
    this_dist_list_k = this_dist_list[:k]  # k closest training vectors

    num_matches, num_non_matches = 0, 0

    for (dist, match_status) in this_dist_list_k:
      if (match_status == 'm'):
        num_matches += 1
      else:
        num_non_matches += 1

    if (num_matches > num_non_matches):
      match_weight_vector_dict[weight_vector_tuple] = \
                                                 weight_vector_count_match_list
    else:
      non_match_weight_vector_dict[weight_vector_tuple] = \
                                                 weight_vector_count_match_list

  assert len(match_weight_vector_dict) + len(non_match_weight_vector_dict) == \
         len(weight_vector_dict)

  print '  Classified %d matches and %d non-matches' % \
        (len(match_weight_vector_dict), len(non_match_weight_vector_dict))
  print

  return match_weight_vector_dict, non_match_weight_vector_dict
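
# The distance sort and majority vote inside knn_split_classifier can be
# expressed compactly with heapq.nsmallest and collections.Counter. A minimal
# sketch of the per-vector decision (a hypothetical helper):

```python
import heapq
import math
from collections import Counter

def knn_label(query, labelled_train, k):
  """Return the majority label ('m' or 'nm') among the k training vectors
     closest to query; labelled_train is a list of (vector, label) pairs."""
  def dist(a, b):
    return math.sqrt(sum((x - y)**2 for (x, y) in zip(a, b)))
  nearest = heapq.nsmallest(k, ((dist(query, vec), label)
                                for (vec, label) in labelled_train))
  votes = Counter(label for (_, label) in nearest)
  return votes.most_common(1)[0][0]
```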

# -----------------------------------------------------------------------------
# Function to perform SVM classification and split the cluster
#
def svm_split_classifier(weight_vector_dict, match_dict, non_match_dict,
                         num_weights):
  """Split the given weight vector dictionary into matches and non-matches
     according to the match and non-match training dictionaries given using an
     SVM classifier.

     Returns two new weight vector dictionaries, one for the classified matches
     and the other for the classified non-matches.
  """

  print 'SVM classification of %d weight vectors' % (len(weight_vector_dict))
  print '  Based on %d matches and %d non-matches' % \
        (len(match_dict), len(non_match_dict))

  # Prepare the training and test data sets for the classifier
  #
  num_train_vec = len(match_dict) + len(non_match_dict)
  num_test_vec =  len(weight_vector_dict)

  train_data =    numpy.zeros([num_train_vec, num_weights])
  train_class =   numpy.zeros(num_train_vec)
  train_weights = numpy.zeros(num_train_vec)

  test_data =  numpy.zeros([num_test_vec, num_weights])
  test_class = numpy.zeros(num_test_vec)

  j = 0
  for (weight_vector_tuple, weight_vector_count_match_list) in \
                                   match_dict.iteritems():
    train_data[j, :] = weight_vector_tuple
    train_class[j] =   1.0
    train_weights[j] = weight_vector_count_match_list[0]
    assert train_weights[j] >= 1
    j += 1
  for (weight_vector_tuple, weight_vector_count_match_list) in \
                                non_match_dict.iteritems():
    train_data[j, :] = weight_vector_tuple
    train_class[j] =   0.0
    train_weights[j] = weight_vector_count_match_list[0]
    assert train_weights[j] >= 1
    j += 1

  weight_vectors_to_classify_list = weight_vector_dict.items()

  j = 0
  for (weight_vector_tuple, weight_vector_count_match_list) in \
                                   weight_vectors_to_classify_list:
    test_data[j, :] = weight_vector_tuple
    match_status = weighted_unique_weight_vec_dict[weight_vector_tuple][1]
    assert match_status in [True, False]
    if (match_status == True):
      test_class[j] = 1.0
    else:
      test_class[j] = 0.0
    j += 1

  classifier = sklearn.svm.SVC(kernel='linear', C=0.1)
  classifier.fit(train_data, train_class, sample_weight=train_weights)
  class_predict = classifier.predict(test_data)

  # The two dictionaries of classified weight vectors to be generated
  #
  match_weight_vector_dict =     {}
  non_match_weight_vector_dict = {}

  num_matches, num_non_matches = 0, 0

  for i in range(num_test_vec):
    weight_vector_tuple, weight_vector_count_match_list = \
                                            weight_vectors_to_classify_list[i]
    if (class_predict[i] == 1.0):
      num_matches += 1
      match_weight_vector_dict[weight_vector_tuple] = \
                                                weight_vector_count_match_list
    else:
      num_non_matches += 1
      non_match_weight_vector_dict[weight_vector_tuple] = \
                                                weight_vector_count_match_list

  assert len(match_weight_vector_dict) + len(non_match_weight_vector_dict) == \
         len(weight_vector_dict)

  print '  Classified %d matches and %d non-matches' % \
        (len(match_weight_vector_dict), len(non_match_weight_vector_dict))
  print

  return match_weight_vector_dict, non_match_weight_vector_dict

# -----------------------------------------------------------------------------
# Function to perform decision tree classification and split the cluster
#
def dtree_split_classifier(weight_vector_dict, match_dict, non_match_dict,
                           num_weights):
  """Split the given weight vector dictionary into matches and non-matches
     according to the match and non-match training dictionaries given using a
     decision tree classifier.

     Returns two new weight vector dictionaries, one for the classified matches
     and the other for the classified non-matches.
  """

  print 'Decision tree classification of %d weight vectors' % \
        (len(weight_vector_dict))
  print '  Based on %d matches and %d non-matches' % \
        (len(match_dict), len(non_match_dict))

  # Prepare the training and test data sets for the classifier
  #
  num_train_vec = len(match_dict) + len(non_match_dict)
  num_test_vec =  len(weight_vector_dict)

  train_data =    numpy.zeros([num_train_vec, num_weights])
  train_class =   numpy.zeros(num_train_vec)
  train_weights = numpy.zeros(num_train_vec)

  test_data =  numpy.zeros([num_test_vec, num_weights])
  test_class = numpy.zeros(num_test_vec)

  j = 0
  for (weight_vector_tuple, weight_vector_count_match_list) in \
                                   match_dict.iteritems():
    train_data[j, :] = weight_vector_tuple
    train_class[j] =   1.0
    train_weights[j] = weight_vector_count_match_list[0]
    assert train_weights[j] >= 1
    j += 1
  for (weight_vector_tuple, weight_vector_count_match_list) in \
                                non_match_dict.iteritems():
    train_data[j, :] = weight_vector_tuple
    train_class[j] =   0.0
    train_weights[j] = weight_vector_count_match_list[0]
    assert train_weights[j] >= 1
    j += 1

  weight_vectors_to_classify_list = weight_vector_dict.items()

  j = 0
  for (weight_vector_tuple, weight_vector_count_match_list) in \
                                   weight_vectors_to_classify_list:
    test_data[j, :] = weight_vector_tuple
    match_status = weighted_unique_weight_vec_dict[weight_vector_tuple][1]
    assert match_status in [True, False]
    if (match_status == True):
      test_class[j] = 1.0
    else:
      test_class[j] = 0.0
    j += 1

  classifier = sklearn.tree.DecisionTreeClassifier(criterion='gini')
  classifier.fit(train_data, train_class, sample_weight=train_weights)
  class_predict = classifier.predict(test_data)

  # The two dictionaries of classified weight vectors to be generated
  #
  match_weight_vector_dict =     {}
  non_match_weight_vector_dict = {}

  num_matches, num_non_matches = 0, 0

  for i in range(num_test_vec):
    weight_vector_tuple, weight_vector_count_match_list = \
                                            weight_vectors_to_classify_list[i]
    if (class_predict[i] == 1.0):
      num_matches += 1
      match_weight_vector_dict[weight_vector_tuple] = \
                                                weight_vector_count_match_list
    else:
      num_non_matches += 1
      non_match_weight_vector_dict[weight_vector_tuple] = \
                                                weight_vector_count_match_list

  assert len(match_weight_vector_dict) + len(non_match_weight_vector_dict) == \
         len(weight_vector_dict)

  print '  Classified %d matches and %d non-matches' % \
        (len(match_weight_vector_dict), len(non_match_weight_vector_dict))
  print

  return match_weight_vector_dict, non_match_weight_vector_dict

# -----------------------------------------------------------------------------
# Function to calculate distance of a cluster from 0 corner
#
def cluster_0_dist(weight_vector_dict, num_weights, dist_mode):
  """Calculate distance of a cluster from the [0] corner using one of minimum,
     average or maximum distance (corresponding to single, average or complete
     link), based on Euclidean distance.

     num_weights gives the dimensionality of the weight vectors.

     Returns a numerical distance value.
  """

  assert dist_mode in ['single', 'average', 'complete'], dist_mode

  zero_vec = [0]*num_weights

  min_dist = 999999.99

  if (dist_mode == 'single'):
    for weight_vector_tuple in weight_vector_dict:
      this_dist = euclidean_dist(zero_vec, weight_vector_tuple, num_weights)
      if (this_dist < min_dist):
        min_dist = this_dist

  elif (dist_mode == 'complete'):
    min_dist = -1.0  # Must start below any possible distance for complete link
    for weight_vector_tuple in weight_vector_dict:
      this_dist = euclidean_dist(zero_vec, weight_vector_tuple, num_weights)
      if (this_dist > min_dist):
        min_dist = this_dist

  else:  # Calculate average distance
    num_weight_vec = len(weight_vector_dict)
    avrg_dist_vec = [0.0]*num_weights
    for weight_vector_tuple in weight_vector_dict:
      for i in range(num_weights):
        avrg_dist_vec[i] += weight_vector_tuple[i]
    for i in range(num_weights):
      avrg_dist_vec[i] /= num_weight_vec
    min_dist = euclidean_dist(zero_vec, avrg_dist_vec, num_weights)

  assert min_dist >= 0, min_dist

  return min_dist

# -----------------------------------------------------------------------------
# Function to calculate number of samples needed from a cluster to obtain
# enough examples for a given confidence and margin of error
#
def get_sample_size(cluster_size, est_proportion, sample_error):
  """Follows equation on page 505 of An Introduction to Statistical Methods
     and Data analysis, Ott and Longnecker, 6th edition.

     We assume a confidence level of 95%.
  """

  z_alpha2 = 1.96  # z value for a 95% confidence interval

  sample_size = z_alpha2**2 * est_proportion * (1.0 - est_proportion) \
                / (sample_error**2)
  #print 'sample size:', sample_size,cluster_size,est_proportion,sample_error

  # For small cluster sizes we adjust:
  #
  sample_size_adj = cluster_size*sample_size / (cluster_size + sample_size)

  #print 'Sample size:', int(math.ceil(sample_size_adj))

  #if (sample_size != sample_size_adj):
  #  print '  Original sample size: %.2f and adjusted sample size: %.2f' % \
  #        (sample_size, sample_size_adj)
  #print

  return int(math.ceil(sample_size_adj))
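
# Worked example of the formula above (a sketch, independent of
# get_sample_size): at 95% confidence z = 1.96, so with estimated proportion
# p = 0.5 and margin of error E = 0.05 the unadjusted sample size is
# n = z^2 * p * (1-p) / E^2 = 384.16, which the finite-population adjustment
# n' = N*n / (N + n) then shrinks for a cluster of N vectors.

```python
import math

def sample_size_example(cluster_size=1000, p=0.5, error=0.05, z=1.96):
  n = z**2 * p * (1.0 - p) / error**2            # Unadjusted sample size
  n_adj = cluster_size * n / (cluster_size + n)  # Finite-population adjustment
  return int(math.ceil(n_adj))
```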

# -----------------------------------------------------------------------------
# Function to append the oracle-classified record pairs to the match (DM) and
# non-match (NDM) files
#
def atualizaDM_NDM(dic, file1, file2):

    dmfile  = open(file1, 'ab')
    ndmfile = open(file2, 'ab')

    writer_dm =  csv.writer(dmfile,  delimiter=';', quotechar=' ')
    writer_ndm = csv.writer(ndmfile, delimiter=';', quotechar=' ')

    for pesos in oracle_class_cache_set:

        for k, v in dic.iteritems():
            if pesos == v[1]:

                ids = k.split('~')

                if v[0]:
                    print 'True match!'
                    writer_dm.writerow(ids)
                else:
                    print 'Non-match!'
                    writer_ndm.writerow(ids)

                break

    dmfile.close()
    ndmfile.close()
    


# Training set generation
#
def geraTrainSet(dic, dir, file1):

    trainSet = open(dir+file1, 'wb')

    writer_ts = csv.writer(trainSet, delimiter=';', quotechar=' ')

    for pesos in oracle_class_cache_set:

        for k, v in dic.iteritems():
            if pesos == v[1]:

                if v[0]:  # A true match
                    lista = list(v[1]) + [1.0]
                else:  # A true non-match
                    lista = list(v[1]) + [0.0]
                writer_ts.writerow(lista)
                break

    trainSet.close()

# Test set generation
#
def geraTestSet(dic, dir, file1):

    cont1 = 0

    testSet = open(dir+file1, 'wb')

    writer_ts = csv.writer(testSet, delimiter=';', quotechar=' ')

    # Remove the vectors that were selected for the training set
    #
    for pesos in oracle_class_cache_set:

        for k, v in dic.iteritems():
            if pesos == v[1]:

                del dic[k]
                cont1 = cont1 + 1
                break

    #print 'Number of deletions: %d' % (cont1)

    # Generate the test set from the remaining vectors
    #
    for k, v in dic.iteritems():

        if v[0]:  # A true match
            lista = list(v[1]) + [1.0]
        else:  # A true non-match
            lista = list(v[1]) + [0.0]
        writer_ts.writerow(lista)

    testSet.close()
    
# def atualizaEstatAA(permutacao, dir):
#
#     with open(dir, 'rb') as f:
#         reader = csv.reader(f)
#         for row in reader:
#             if row[2] == permutacao:  # Permutation column
#                 print
#                 print row

# import pandas as pd

# I think this will not be a function: better to create the dataframe before
# the first iteration and update it at the end of each one; after that close
# and save the file.
# def atualizaEstatAA(permutacao, arq):
#
#     estatisticas = pd.read_csv(arq)
#
#     estatisticas.set_index('permutacao')
#
#     linha = estatisticas.loc[permutacao, :]  # Store the row for this
#                                              # permutation
#
#     tp = linha[:tp] + tp
#     # Same for fp, tn and fn
#     # Compute the metrics and also store inspecoesManuais, dm, ndm,
#     # permutacao and the stage "AA"
#
#     # Append at the end of the file


# def setVetorSim(nome):
#     print 'Entering setVetorSim() with file %s' % (nome)
#     weight_vec_file_name = nome
#     print 'weight_vec_file_name = %s' % (weight_vec_file_name)

# def getNomeVetorSim():
#     return weight_vec_file_name

# =============================================================================
# Main program

# Step 1: Load, analyse and clean the weight vectors file
#

# Repetitions for the experiments

# dirOrig = "../csv/conjuntosDS/conjuntosDiverg/"
dirOrig = "../csv/conjuntosDS/conjuntosDivergAA/"
estat = "../csv/estatisticaInicialDS.csv"

# Directories for Windows
# dirOrig = "C:\Users\Diego\Documents\NetBeansProjects\Master-SKYAM\AS\src\csv\conjuntosDS\conjuntosDiverg\\"
# estat = "C:\Users\Diego\Documents\NetBeansProjects\Master-SKYAM\AS\src\csv\estatisticaInicialDS.csv"


# dirOrig = "..\..\Documents\NetBeansProjects\Master-SKYAM\AS\src\csv\conjuntosDS\conjuntosDiverg"
# estat = "..\..\Documents\NetBeansProjects\Master-SKYAM\AS\src\csv\estatisticaInicialDS.csv"

# Directories for Linux
# dirOrig = "./arqResult/csv/conjuntosDS/conjuntosDiverg/"
# estat = "./arqResult/csv/estatisticaInicialDS.csv"

# /home/diego/anaconda3/rerequestingofalgorithmsforconductingresearch/arqResult/csv


etapa = '2 - AA[pet-chr]'

import pandas as pd
import re

estatisticas = pd.read_csv(estat, index_col=['algoritmosUtilizados', 'etapa', 'permutacao'], sep=';')
# estatisticas.set_index('permutacao')
estatisticas.head()
print(estatisticas.columns)


print 'estatisticas.shape'
print estatisticas.shape

arquivos = []  # Added later

for _, _, arquivo in os.walk(dirOrig):
    arquivos.extend(arquivo)

#print 'Number of files: %d' % (len(arquivos))

for arq in arquivos:

    if '_NEW' in arq:
        print 'Analysing file: %s' % (arq)

        # Extract the permutation number and the algorithms-used code from
        # the file name (TODO: do each substitution in a single expression)
        #
        num = re.sub('diverg.*\)', r'', arq)
        num = num.replace('_NEW.csv', '')

        algUtl = re.sub('diverg.*\(', r'', arq)
        algUtl = re.sub('\).*', r'', algUtl)
        algUtl = int(algUtl)
        permutacao = int(num)

        # Store the statistics row for this permutation
        #
        linhaAtual = estatisticas.xs((algUtl, '1 - acm diverg', permutacao))

        print type(linhaAtual)
        print 'Current statistics row:'
        print linhaAtual.shape
        print linhaAtual

        start_time = time.time()

        weights_name_list, weight_vector_dict = \
                        load_weight_vec_file(dirOrig+arq, weights_to_use_list)
        file_num_weight_vectors = len(weight_vector_dict)

        # Added by Diego: keep a shallow copy of the original weight vectors
        #
        weight_vector_dict_orig = dict(weight_vector_dict)

        weight_vector_dict, weighted_unique_weight_vec_dict, num_true_matches, \
               num_true_non_matches = analyse_filter_weight_vectors(weight_vector_dict)

        unique_num_weight_vectors = len(weighted_unique_weight_vec_dict)

        data_prep_time = time.time() - start_time
        print 'Time to load and analyse the weight vector file: %.2f sec' % \
             (data_prep_time)
        print

        # A flag, set to True once the first (initial) selection has been done (as the
        # initial selection function is different from all following ones)
        #
        init_selection_done = False

        cluster_queue = []  # Clusters of weight vectors that need to be split further

        cluster_size_list =     []  # Collect statistics about cluster sizes, their
        cluster_pureness_list = []  # pureness, entropy and time required
        cluster_entropy_list =  []
        cluster_use_pure_list = []  # Pureness of only the clusters used for training
        cluster_sample_size =   []  # Number of samples required per cluster
        loop_sel_time_list =    []
        loop_oracle_time_list = []
        loop_class_time_list =  []
        num_clusters_used =      0  # How many clusters were used for training

        # Calculate initial estimated proportion of matches as minimum data set size
        # divided by number of weight vectors
        #
        cluster_est_proportion = float(min_data_set_size) / unique_num_weight_vectors

        cluster_est_proportion = 0.5  # Override the estimate: 0.5 gives the largest possible sample

        print 'Initial estimated match proportion: %.3f' % \
              (cluster_est_proportion)
        print

        # The queue contains tuples with:
        # (weight vector dictionary, cluster purity, cluster_entropy, cluster size,
        # cluster_est_proportion)

        # Start with all weight vectors in one cluster

        # For the initial cluster we set purity to 0.5 (minimum possible value) as
        # it is unknown; similarly, entropy is set to its maximum of 1.0
        #
        cluster_queue.append((weighted_unique_weight_vec_dict, 0.5, 1.0,
                              len(weighted_unique_weight_vec_dict),
                              cluster_est_proportion))

        # Keep a set of all weight vectors 'manually' classified by the oracle (so we
        # can check the budget and stop once maximum budget is reached)
        #
        oracle_class_cache_set = set()

        # Two dictionaries with the weight vectors selected as training data
        #
        final_match_weight_vector_dict =     {}
        final_non_match_weight_vector_dict = {}

        # Step 2: Recursively split clusters until stopping criteria achieved
        #
        loop_count = 0

        while (cluster_queue != []):  # As long as we have clusters to be split further
          loop_count += 1

          print '- '*40
          print 'Loop %d: Queue length: %d' % (loop_count, len(cluster_queue))
          print '  Number of manual oracle classifications performed:', \
                len(oracle_class_cache_set)
          print '  Size, purity, entropy, and estimated match proportion of ' + \
                'clusters in queue:'
          for cluster_tuple in cluster_queue:
            print '   ', (cluster_tuple[3], cluster_tuple[1], cluster_tuple[2], \
                          cluster_tuple[4])
          print
          print 'Current size of match and non-match training data sets: %d / %d' % \
                (len(final_match_weight_vector_dict),
                 len(final_non_match_weight_vector_dict))
          print

          # Step 2a: Select a cluster from the queue

          # Extension January 2015: Different orderings of the blocks in the queue
          # 'fifo' First in first out - our current approach
          # 'random' Random order - use random.choice()
          # 'max_puri'  Clusters with highest purity first (purity from parent cluster)
          # 'min_puri'  Clusters with lowest purity first (purity from parent cluster)
          # 'max_size'  Largest clusters first
          # 'min_size'  Smallest clusters first
          # 'min_entr'  Clusters with lowest entropy first (based on entropy of parent
          #             cluster, as child cluster entropy can only be smaller)
          # 'max_entr'  Clusters with highest entropy first (again based on parent
          #             cluster entropy)
          # 'close_01'  Closest to 0 or 1 corner
          # 'close_mid' Closest to middle (half between 0 and 1 corners)
          # 'balance'   Select so that training set sizes are balanced
          # 'sample'    Select cluster which has largest ratio of cluster size divided
          #             by the number of samples required
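Several of these orderings are simple arg-min/arg-max scans over the queue; a minimal sketch of how they could be expressed with a single key function, assuming (as in this script) that queue entries are tuples of (weight_vec_dict, purity, entropy, size, est_proportion):

```python
# Hedged sketch: next-cluster selection for several of the queue orderings
# listed above.  Entries are tuples (weight_vec_dict, purity, entropy,
# size, est_proportion); the chosen entry is removed from the queue.
def pick_next_cluster(cluster_queue, queue_order):
    if queue_order == 'fifo':
        return cluster_queue.pop(0)
    key_funcs = {
        'max_puri': lambda t: -t[1],  # highest purity first
        'min_puri': lambda t:  t[1],  # lowest purity first
        'max_entr': lambda t: -t[2],  # highest entropy first
        'min_entr': lambda t:  t[2],  # lowest entropy first
        'max_size': lambda t: -t[3],  # largest cluster first
        'min_size': lambda t:  t[3],  # smallest cluster first
    }
    best = min(cluster_queue, key=key_funcs[queue_order])
    cluster_queue.remove(best)
    return best
```

This avoids repeating the same scan-and-track loop once per ordering, but the explicit loops below keep the original script's structure.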

          # Need to calculate distances of all clusters to 0 corner first
          #
          if (queue_order in ['close_01','close_mid','balance']):  

            cluster_dist_list = []  # Pairs of distances and corresponding clusters

            for cluster_tuple in cluster_queue:
              cluster_dist = cluster_0_dist(cluster_tuple[0], num_weights, \
                                            CLUSTER_MIN_DIST)
              if (queue_order == 'close_01'):
                abs_dist = min(cluster_dist, 1.0-cluster_dist)  # Distance from corner
                cluster_dist_list.append((abs_dist, cluster_tuple))
              elif (queue_order == 'close_mid'):
                abs_dist = abs(0.5-cluster_dist)  # Distance from middle
                cluster_dist_list.append((abs_dist, cluster_tuple))
              elif (queue_order == 'balance'):
                cluster_dist_list.append((cluster_dist, cluster_tuple))
              else:
                raise Exception(queue_order)

            cluster_dist_list.sort()  # Smallest distance first

            if (queue_order in ['close_01', 'close_mid']):  # Take first
              next_cluster_tuple = cluster_dist_list[0][1]
              cluster_queue.remove(next_cluster_tuple)
            else:  # Select closest depending on size of current training data sets
              num_matches =     len(final_match_weight_vector_dict)
              num_non_matches = len(final_non_match_weight_vector_dict)

              # Use <= to favour selecting likely matches over non-matches
              #
              if (num_matches <= num_non_matches): # Select cluster closest to 1 corner
                print '  Balanced cluster selection, select cluster closest to' + \
                      ' [1,..,1]'
                print
                next_cluster_tuple = cluster_dist_list[-1][1]
                cluster_queue.remove(next_cluster_tuple)
              else:
                print '  Balanced cluster selection, select cluster closest to' + \
                      '[0,...,0]'
                print
                next_cluster_tuple = cluster_dist_list[0][1]
                cluster_queue.remove(next_cluster_tuple)

          elif (queue_order == 'fifo'):
            next_cluster_tuple = cluster_queue.pop(0)

          elif (queue_order == 'random'):
            next_cluster_tuple = random.choice(cluster_queue)
            cluster_queue.remove(next_cluster_tuple)

          elif (queue_order == 'max_puri'):
            check_max_purity = 0.0
        #    next_cluster_tuple = None
            for cluster_tuple in cluster_queue:
              if (cluster_tuple[1] > check_max_purity):
                check_max_purity = cluster_tuple[1]
                next_cluster_tuple = cluster_tuple
            cluster_queue.remove(next_cluster_tuple)

          elif (queue_order == 'min_puri'):
            check_min_purity = 2.0
        #    next_cluster_tuple = None
            for cluster_tuple in cluster_queue:
              if (cluster_tuple[1] < check_min_purity):
                check_min_purity = cluster_tuple[1]
                next_cluster_tuple = cluster_tuple
            cluster_queue.remove(next_cluster_tuple)

          elif (queue_order == 'max_entr'):
            check_max_entropy = -1.0
        #    next_cluster_tuple = None
            for cluster_tuple in cluster_queue:
              if (cluster_tuple[2] > check_max_entropy):
                check_max_entropy = cluster_tuple[2]
                next_cluster_tuple = cluster_tuple
            cluster_queue.remove(next_cluster_tuple)

          elif (queue_order == 'min_entr'):
            check_min_entropy = 2.0
        #    next_cluster_tuple = None
            for cluster_tuple in cluster_queue:
              if (cluster_tuple[2] < check_min_entropy):
                check_min_entropy = cluster_tuple[2]
                next_cluster_tuple = cluster_tuple
            cluster_queue.remove(next_cluster_tuple)

          elif (queue_order == 'max_size'):
            check_max_size = -1.0
        #    next_cluster_tuple = None
            for cluster_tuple in cluster_queue:
              if (cluster_tuple[3] > check_max_size):
                check_max_size = cluster_tuple[3]
                next_cluster_tuple = cluster_tuple
            cluster_queue.remove(next_cluster_tuple)

          elif (queue_order == 'min_size'):
            check_min_size = 99999999.0
        #    next_cluster_tuple = None
            for cluster_tuple in cluster_queue:
              if (cluster_tuple[3] < check_min_size):
                check_min_size = cluster_tuple[3]
                next_cluster_tuple = cluster_tuple
            cluster_queue.remove(next_cluster_tuple)

          elif (queue_order == 'sample'):
            check_max_ratio = -1.0

            for cluster_tuple in cluster_queue:
              this_cluster_size = cluster_tuple[3]
              this_cluster_prop = cluster_tuple[4]
              this_select_num = get_sample_size(this_cluster_size,
                                                this_cluster_prop, sample_error)
              this_cluster_ratio = float(this_cluster_size) / this_select_num
              if (this_cluster_ratio > check_max_ratio):
                check_max_ratio = this_cluster_ratio
                next_cluster_tuple = cluster_tuple
            cluster_queue.remove(next_cluster_tuple)

        ## TODO: Add optimal ordering - combine balancing, sample size, purity etc.

          else:
            raise Exception(queue_order)

          (next_weight_vec_dict, cluster_purity, cluster_entropy, cluster_size, \
                                          cluster_est_proportion) = next_cluster_tuple

          cluster_size_list.append(cluster_size)

          print 'Selected cluster with (queue ordering: %s):' % (queue_order)
          print '- Purity %.2f and entropy %.2f' % (cluster_purity, cluster_entropy)
          print '- Size %d weight vectors' % (cluster_size)
          print '- Estimated match proportion %.3f' % (cluster_est_proportion)
          print

          # Calculate sample size (number of weight vectors to select) for this cluster
          #
          select_num = get_sample_size(cluster_size, cluster_est_proportion, \
                                       sample_error)
          cluster_sample_size.append(select_num)

          print 'Sample size for this cluster:', select_num
          print
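The get_sample_size function is defined earlier in the script (not in this excerpt). The sample sizes it produces in the run output below (70 from 265 vectors at proportion 0.5, and 56 from 134 vectors at proportion 37/70) are consistent with Cochran's sample-size formula with finite population correction, assuming a 95% confidence z-score of 1.96 and sample_error = 0.1; this reconstruction is a sketch, not the confirmed implementation:

```python
# Hedged reconstruction of the sample-size calculation: Cochran's formula
# with finite population correction (fpc), truncated to an integer.
def cochran_sample_size(population_size, est_proportion, sample_error, z=1.96):
    # Infinite-population sample size for a given margin of error
    n0 = (z ** 2) * est_proportion * (1.0 - est_proportion) / sample_error ** 2
    # Finite population correction, truncated to an integer
    return int(n0 / (1.0 + (n0 - 1.0) / population_size))
```

With these assumed parameters the function reproduces the two sample sizes printed in the output.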

          # Step 2b: Get representative weight vectors from this cluster
          #
          start_time = time.time()

          if (init_selection_done == False):  # Do initial selection
            init_selection_done = True
            print 'Perform initial selection using "%s" method' % (init_method)
            print

            if (init_method == 'far'):
              rep_weight_vec_dict = select_farthest(next_weight_vec_dict, select_num, \
                                                    num_weights)
            elif (init_method == '01'):
              rep_weight_vec_dict = select_01(next_weight_vec_dict, select_num, \
                                              num_weights)
            elif (init_method == 'corner'):
              rep_weight_vec_dict = select_corners(next_weight_vec_dict, num_weights)

            else:  # Random
             rep_weight_vec_dict = select_random(next_weight_vec_dict, select_num)

          else:  # Any following selection
            if (select_method == 'far'):
              rep_weight_vec_dict = select_farthest(next_weight_vec_dict, select_num, \
                                                    num_weights)
            elif (select_method == 'far_med'):
              rep_weight_vec_dict = select_farthest(next_weight_vec_dict, select_num, \
                                                    num_weights)
              centroid_weight_vec_dict = select_medoid(next_weight_vec_dict, 1, \
                                                       num_weights)
              assert len(centroid_weight_vec_dict) == 1

              # Add centroids to representative weight vectors
              #
              for weight_vector_tuple in centroid_weight_vec_dict:
                rep_weight_vec_dict[weight_vector_tuple] = \
                                          centroid_weight_vec_dict[weight_vector_tuple]

            elif (select_method == 'dense'):
              rep_weight_vec_dict = select_densest(next_weight_vec_dict, select_num, \
                                                   num_weights)

            elif (select_method == 'aggl'):
              rep_weight_vec_dict = select_agglomerative(next_weight_vec_dict,
                                                         select_num, \
                                                         num_weights)

            else:  # Random
              rep_weight_vec_dict = select_random(next_weight_vec_dict, select_num)

          sel_time = time.time() - start_time
          loop_sel_time_list.append(sel_time)

          # Step 2c: Give selected weight vectors to oracle for 'manual' classification
          #
          start_time = time.time()

          match_dict, non_match_dict, purity, entropy = \
                                               oracle(rep_weight_vec_dict, oracle_acc)
          cluster_pureness_list.append(purity)
          cluster_entropy_list.append(entropy)

          assert len(match_dict) + len(non_match_dict) == len(rep_weight_vec_dict)
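The purity and entropy values the oracle returns (its definition is also earlier in the script) match the standard two-class definitions; the sketch below reproduces the 0.529 / 0.998 figures printed in the output for 37 matches out of 70 classified vectors:

```python
import math

# Hedged sketch of two-class purity and Shannon entropy for an
# oracle-labelled sample: p is the fraction of matches in the sample.
def sample_purity_entropy(num_matches, num_non_matches):
    total = num_matches + num_non_matches
    p = float(num_matches) / total
    purity = max(p, 1.0 - p)              # 0.5 = fully mixed, 1.0 = pure
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)  # base-2 Shannon entropy
    return purity, entropy
```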

          # Calculate a new estimate for match proportion based on size of manually
          # classified weight vectors
          #
          cluster_est_proportion = float(len(match_dict)) / len(rep_weight_vec_dict)

          # Update the cache with the 'manually' classified weight vectors
          #
          for weight_vector_tuple in rep_weight_vec_dict:
            oracle_class_cache_set.add(weight_vector_tuple)

          # Add the manually classified weight vectors into the final training sets
          # and remove from current cluster as well as from the original weight vector
          # dictionary
          #
          for weight_vector_tuple in match_dict:
            final_match_weight_vector_dict[weight_vector_tuple] = \
                                                        match_dict[weight_vector_tuple]
            del next_weight_vec_dict[weight_vector_tuple]
            if (next_weight_vec_dict is not weighted_unique_weight_vec_dict):
              del weighted_unique_weight_vec_dict[weight_vector_tuple]

          for weight_vector_tuple in non_match_dict:
            final_non_match_weight_vector_dict[weight_vector_tuple] = \
                                                    non_match_dict[weight_vector_tuple]
            del next_weight_vec_dict[weight_vector_tuple]
            if (next_weight_vec_dict is not weighted_unique_weight_vec_dict):
              del weighted_unique_weight_vec_dict[weight_vector_tuple]

          print 'Deleted %d weight vectors (classified by oracle) from cluster' % \
                (len(match_dict)+len(non_match_dict))
          print

          oracle_time = time.time() - start_time
          loop_oracle_time_list.append(oracle_time)

          # Step 2f: Decide if cluster needs/can be split further or not
          # Stopping criteria:
          # 1) Cluster is pure enough and not too large (<= max_cluster_size)
          #    -> Add to final training data
          # 2) Cluster is too small for further splitting -> Do not use for training
          # 3) No more budget left for future 'manual' oracle classification

          # If the cluster is pure enough and not too large then add all its weight
          # vectors to the final dictionaries of training data, and remove them from
          # the original weight vector dictionary
          #
          if ((cluster_size <= max_cluster_size) and (purity >= min_purity)):
            print 'Cluster is pure enough and not too large, add its ' + \
                  '%d weight vectors to:' % (cluster_size)

            num_clusters_used += 1
            cluster_use_pure_list.append(purity)

            if (len(match_dict) > len(non_match_dict)):  # The cluster contains matches
              print '  Match training set'
              for weight_vector_tuple in next_weight_vec_dict.keys():
                final_match_weight_vector_dict[weight_vector_tuple] = \
                                             next_weight_vec_dict[weight_vector_tuple]
                del weighted_unique_weight_vec_dict[weight_vector_tuple]
            else:  # The cluster contains non-matches
              print '  Non-match training set'
              for weight_vector_tuple in next_weight_vec_dict.keys():
                final_non_match_weight_vector_dict[weight_vector_tuple] = \
                                             next_weight_vec_dict[weight_vector_tuple]
                del weighted_unique_weight_vec_dict[weight_vector_tuple]
            print

          # Check if the cluster can be split further
          #
          elif (cluster_size <= min_cluster_size):
            print 'Cluster is too small for further splitting but not pure ' + \
                  'enough to use, so do not add it to the training data'
            print

          else:  # The cluster is too large or not pure enough and it can be split
            print 'Cluster not pure enough or too large, and can be split further'
            print

            # Check if we still have manual classifications left
            #
            if (len(oracle_class_cache_set) >= budget_num_class):
              print 'Reached end of manual classification budget'
              print
              break  # Leave while loop, no more budget to do manual classification

            # Step 2d: Split the cluster using a binary classifier
            #
            if ((len(match_dict) > 0) and (len(non_match_dict) > 0)):
              # We need training weight vectors in both classes

              start_time = time.time()

              if (split_classifier == 'knn'):
                class_match_dict, class_non_match_dict = \
                                  knn_split_classifier(next_weight_vec_dict, KNN_K, \
                                                       match_dict, non_match_dict)
              elif (split_classifier == 'dtree'):
                class_match_dict, class_non_match_dict = \
                           dtree_split_classifier(next_weight_vec_dict, match_dict, \
                                                  non_match_dict, num_weights)
              else:
                class_match_dict, class_non_match_dict = \
                           svm_split_classifier(next_weight_vec_dict, match_dict, \
                                                non_match_dict, num_weights)

              class_time = time.time() - start_time
              loop_class_time_list.append(class_time)
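The three split classifiers are defined earlier in the script. As an illustration only, a k-nearest-neighbour split along the lines of the knn_split_classifier call above could look like this (the signature and the count-valued dictionaries are assumptions taken from the call site, not the original implementation):

```python
import math

# Illustrative k-NN split (assumed signature from the call site above):
# label each weight vector in the cluster by majority vote among its k
# nearest oracle-classified vectors, using Euclidean distance.
def knn_split(weight_vec_dict, k, match_dict, non_match_dict):
    labelled = [(vec, True) for vec in match_dict] + \
               [(vec, False) for vec in non_match_dict]

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    class_match_dict, class_non_match_dict = {}, {}
    for vec, count in weight_vec_dict.items():
        neighbours = sorted(labelled, key=lambda lv: dist(vec, lv[0]))[:k]
        match_votes = sum(1 for _, is_match in neighbours if is_match)
        if match_votes * 2 > len(neighbours):   # majority voted match
            class_match_dict[vec] = count
        else:
            class_non_match_dict[vec] = count
    return class_match_dict, class_non_match_dict
```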

              # First check that both sub-clusters contain weight vectors, if not (i.e.
              # if all weight vectors are in one sub-cluster) then we cannot add as we
              # would add the same cluster as we had before -> endless loop
              #
              if ((len(class_match_dict) > 0) and (len(class_non_match_dict) > 0)):

                # Only add to queue if a sub-cluster contains enough weight vectors
                # (i.e. more than needed in the selection process)
                #
                if (len(class_match_dict) > 0):
                  select_num = get_sample_size(len(class_match_dict), \
                                               cluster_est_proportion, \
                                               sample_error)
                  if (len(class_match_dict) >= select_num):
                    cluster_queue.append((class_match_dict, purity, entropy,
                                          len(class_match_dict),
                                          cluster_est_proportion))
                  else:
                    print '  Match cluster not large enough for required sample size'

                if (len(class_non_match_dict) > 0):
                  select_num = get_sample_size(len(class_non_match_dict), \
                                               cluster_est_proportion, \
                                               sample_error)
                  if (len(class_non_match_dict) >= select_num):
                    cluster_queue.append((class_non_match_dict, purity, entropy,
                                          len(class_non_match_dict),
                                          cluster_est_proportion))
                  else:
                    print '  Non-match cluster not large enough for required ' + \
                          'sample size'

              else:
                # Cluster is not used further for training

                # This case happens if we have a pure cluster that is too large.
                # It will likely not be split further - so what can we do here?
                # Split randomly into 2? Do farthest first with k=2, then split
                # into 2 according to nearest?
                pass  # TODO **********************************************************

            else:  # We have training weight vectors in one class only
              pass # TODO - what here?
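One option for the TODO above (splitting a pure but over-large cluster), along the lines of the "farthest first with k=2" idea in the comment; this is a sketch, not part of the original script:

```python
import math

# Sketch of the 'farthest first with k=2' split suggested in the TODO:
# pick two far-apart weight vectors as seeds, then assign every vector in
# the cluster to its nearest seed (Euclidean distance).
def farthest_pair_split(weight_vec_dict):
    vecs = list(weight_vec_dict)

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    seed_a = max(vecs, key=lambda v: dist(v, vecs[0]))  # far from start point
    seed_b = max(vecs, key=lambda v: dist(v, seed_a))   # farthest from seed_a
    part_a, part_b = {}, {}
    for vec in vecs:
        if dist(vec, seed_a) <= dist(vec, seed_b):
            part_a[vec] = weight_vec_dict[vec]
        else:
            part_b[vec] = weight_vec_dict[vec]
    return part_a, part_b
```

Both halves could then be re-queued, giving the oracle a chance to sample each at a size below max_cluster_size.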

        dirDest = "../csv/conjuntosDS/treinoTeste/"
#         dirDest = "C:/Users/Diego/Documents/NetBeansProjects/Master-SKYAM/AS/src/csv/conjuntosDS/treinoTeste/"
#         dirDest = "../../Documents/NetBeansProjects/Master-SKYAM/AS/src/csv/conjuntosDS/treinoTeste/"
        
        
#         geraTrainSet(weight_vector_dict_orig, dirDest, 'train' + '(' + int(algUtl) + ')' + num + '.csv')    

#         geraTestSet(weight_vector_dict_orig, dirDest, 'test' + '(' + int(algUtl) + ')' + num + '.csv')
        
#         print 'Number of manual oracle classifications done: %d (out of total ' % \
#       (len(oracle_class_cache_set)) + 'budget of %d)' % (budget_num_class)
#         print ''
        
        abordagem = 'DS'
        #print 'abordagem é %s' %(abordagem)
        
        #algUtl = linhaAtual['algoritmosUtilizados'].item()
        iteracao = 1
        inspecoesManuais = len(oracle_class_cache_set)
        print linhaAtual['da'].item()
        
        da = linhaAtual['da'].item()
        dm = len(final_match_weight_vector_dict)
        ndm = len(final_non_match_weight_vector_dict)

        tp = float(linhaAtual['tp'].item() + dm)
        fp = float(linhaAtual['fp'].item())
        tn = float(linhaAtual['tn'].item())# + ndm) # Removed
        fn = float(linhaAtual['fn'].item() - dm) # Added
        
#         print 'tp'
#         print type(tp)
#         print tp
#         print 'fp'
#         print type(fp)
#         print fp
#         print 'tn'
#         print type(tn)
#         print tn
#         print 'fn'
#         print type(fn)
#         print fn
        
        precision = tp/(tp+fp)
#         print type(precisao)
#         print precisao
#         print 'Precisão:'
#         print type(precision)
#         print precision
        recall = tp/(tp+fn)
        fmeasure = 2*((precision*recall)/(precision+recall))
        
        
        
        # Add the values to the last row
        estatisticas.loc[(algUtl, etapa, permutacao), ['abordagem', 'iteracao', 'inspecoesManuais',
           'precision', 'recall', 'f-measure', 'da', 'dm', 'ndm', 'tp',
           'fp', 'tn', 'fn'] ] = ([abordagem, iteracao, inspecoesManuais,
           precision, recall, fmeasure, da, dm, ndm, tp, fp, tn, fn])
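As a sanity check on the metric formulas above: the F-measure is algebraically equal to 2*tp / (2*tp + fp + fn), and plugging in the iteration-0 counts shown in the run output further below (tp=54, fp=1, fn=245) reproduces the printed precision, recall, and f-measure:

```python
# Worked check of the metric formulas above, using the iteration-0 counts
# from the run output (tp=54, fp=1, fn=245).
tp, fp, fn = 54.0, 1.0, 245.0
precision = tp / (tp + fp)    # 54/55
recall = tp / (tp + fn)       # 54/299
fmeasure = 2 * ((precision * recall) / (precision + recall))
# fmeasure equals 2*tp / (2*tp + fp + fn) = 108/354
```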
        
        dirDest = "../csv/conjuntosDS/treinoTeste/"
#         dirDest = "../../Documents/NetBeansProjects/Master-SKYAM/AS/src/csv/conjuntosDS/treinoTeste/"
#         dirDest = "./arqResult/csv/conjuntosDS/conjuntosDiverg/treinoTeste/"
        
        #algUtl = str(algUtl).replace('.0','')
        algUtl = str(algUtl)
        
        geraTrainSet(weight_vector_dict_orig, dirDest, 'train' + '(' + algUtl + ')' + num + '.csv')    

        geraTestSet(weight_vector_dict_orig, dirDest, 'test' + '(' + algUtl + ')' + num + '.csv')

        # Return the dataframe to normal (reorder the columns afterwards)

estatisticas = estatisticas.reset_index(level=['algoritmosUtilizados', 'etapa', 'permutacao'])

estatisticas = estatisticas[['abordagem', 'etapa', 'algoritmosUtilizados', 'permutacao', 'iteracao', 'inspecoesManuais', 'precision', 'recall', 'f-measure', 'da', 'dm', 'ndm', 'tp', 'fp', 'tn', 'fn']]

estatisticas[['algoritmosUtilizados', 'iteracao', 'inspecoesManuais', 'da', 'dm', 'ndm', 'tp', 'fp', 'tn', 'fn']] = \
estatisticas[['algoritmosUtilizados', 'iteracao', 'inspecoesManuais', 'da', 'dm', 'ndm', 'tp', 'fp', 'tn', 'fn']].astype(int)

# Directory for Windows
dirEst = "../csv/"
# dirEst = "C:\Users\Diego\Documents\NetBeansProjects\Master-SKYAM\AS\src\csv\\"
# dirEst = "../../Documents/NetBeansProjects/Master-SKYAM/AS/src/csv/"


# Directory for Linux
# dirEst = "./arqResult/csv/"

estatisticas.to_csv(dirEst+'estatisticaInicialDS2.csv', sep=';', index=False)
Index([u'abordagem', u'iteracao', u'inspecoesManuais', u'precision', u'recall',
       u'f-measure', u'da', u'dm', u'ndm', u'tp', u'fp', u'tn', u'fn'],
      dtype='object')
estatisticas.shape
(3000, 13)
Analysing the file: diverg(10)598_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                 DS
iteracao                   0
inspecoesManuais           0
precision           0.981818
recall              0.180602
f-measure           0.305085
da                        55
dm                         0
ndm                        0
tp                        54
fp                         1
tn                  47652903
fn                       245
Name: (10, 1 - acm diverg, 598), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)598_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 295
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 295 weight vectors
  Containing 192 true matches and 103 true non-matches
    (65.08% true matches)
  Identified 265 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   250  (94.34%)
          2 :    12  (4.53%)
          3 :     2  (0.75%)
         15 :     1  (0.38%)

Identified 1 non-pure unique weight vectors (from 265 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 164
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 100

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 294
  Number of unique weight vectors: 265

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (265, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 265 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 70

Perform initial selection using "far" method

Farthest first selection of 70 weight vectors from 265 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00 accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 37 matches and 33 non-matches
    Purity of oracle classification:  0.529
    Entropy of oracle classification: 0.998
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  33
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 195 weight vectors
  Based on 37 matches and 33 non-matches
  Classified 134 matches and 61 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 70
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (134, 0.5285714285714286, 0.9976432959863937, 0.5285714285714286)
    (61, 0.5285714285714286, 0.9976432959863937, 0.5285714285714286)

Current size of match and non-match training data sets: 37 / 33

Selected cluster (queue ordering: random) with:
- Purity 0.53 and entropy 1.00
- Size 134 weight vectors
- Estimated match proportion 0.529

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 134 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
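The farthest-first selections logged in this run can be sketched as follows: start from a seed, then repeatedly add the vector whose minimum distance to the already-selected set is largest. Squared Euclidean distance and the arbitrary first-element seed are assumptions of this sketch, not necessarily the program's exact choices:

```python
# Farthest-first sampling sketch (seed and metric are assumptions).

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(vectors, k):
    selected = [vectors[0]]          # arbitrary seed vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest selected neighbour
        best = max(remaining,
                   key=lambda v: min(sq_dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

vecs = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [0.5, 0.5]]
selected = farthest_first(vecs, 3)
print(selected)  # [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
```

This greedy rule favours spread-out samples, which is why the selections above mix very high and very low similarity vectors.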

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 50 matches and 6 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.491
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55
Analyzing file: diverg(20)520_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 520), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)520_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
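The occurrence table above comes from two counting passes: count each unique weight vector, then count how many vectors share each frequency. A minimal sketch with hypothetical vectors:

```python
from collections import Counter

# Hypothetical weight vectors (as hashable tuples)
vectors = [(1.0, 0.5), (1.0, 0.5), (0.3, 0.2), (0.3, 0.2), (0.9, 0.9)]

vec_freq = Counter(vectors)             # how often each unique vector occurs
freq_dist = Counter(vec_freq.values())  # how many vectors occur that often

total = len(vec_freq)
for occ in sorted(freq_dist):
    count = freq_dist[occ]
    print('%4d : %4d  (%.2f%%)' % (occ, count, 100.0 * count / total))
```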

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector
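The pureness filter above handles unique weight vectors that occur with both labels: pureness is the fraction of match-labelled occurrences, and minority-label copies of any non-pure vector are removed. A sketch under that reading, with hypothetical labelled pairs (19 matches and 1 non-match for the same vector, matching the 0.950 pureness row above):

```python
from collections import defaultdict

# (vector, is_match) pairs; one vector is non-pure (19 True, 1 False)
pairs = ([((0.9, 1.0), True)] * 19 + [((0.9, 1.0), False)]
         + [((0.1, 0.0), False)] * 5)

by_vec = defaultdict(list)
for vec, is_match in pairs:
    by_vec[vec].append(is_match)

kept, removed = [], 0
for vec, labels in by_vec.items():
    pureness = sum(labels) / float(len(labels))  # fraction of matches
    majority = pureness >= 0.5
    for lab in labels:
        if 0.0 < pureness < 1.0 and lab != majority:
            removed += 1  # minority copy of a non-pure vector: drop it
        else:
            kept.append((vec, lab))

print('Removed %d minority-class weight vector(s)' % removed)
```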

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)641_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 641), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)641_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analyzing file: diverg(15)816_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (15, 1 - acm diverg, 816), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)816_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 723
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 723 weight vectors
  Containing 170 true matches and 553 true non-matches
    (23.51% true matches)
  Identified 686 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   655  (95.48%)
          2 :    28  (4.08%)
          3 :     2  (0.29%)
          6 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 686 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 153
     0.000 : 533

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 723
  Number of unique weight vectors: 686

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (686, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 686 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 686 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 26 matches and 58 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.893
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
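
The purity and entropy figures reported for each oracle-classified sample are the majority-class share and the binary Shannon entropy of the match/non-match split. A minimal sketch reproducing them (the helper name `purity_entropy` is ours, not the script's):

```python
import math

def purity_entropy(num_match, num_nonmatch):
    # Purity: share of the majority class in the sample.
    # Entropy: binary Shannon entropy of the match proportion.
    total = num_match + num_nonmatch
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# For the 26 matches / 58 non-matches classified above:
purity, entropy = purity_entropy(26, 58)  # ~0.690 and ~0.893, as in the log
```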

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 602 weight vectors
  Based on 26 matches and 58 non-matches
  Classified 112 matches and 490 non-matches
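
The SVM classification step trains on the oracle-labelled samples and splits the remaining cluster by predicted class. A minimal sketch assuming scikit-learn is available (the function name `svm_split` and the linear-kernel choice are illustrative; the script's actual SVM configuration may differ):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    # Train on the oracle-labelled samples (1 = match, 0 = non-match) ...
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    # ... then split the remaining cluster into two child clusters
    # according to the predicted class of each weight vector.
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```

Both child clusters are then pushed back onto the queue, which is why the queue length grows by one per split.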

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)
    (490, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)

Current size of match and non-match training data sets: 26 / 58

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 490 weight vectors
- Estimated match proportion 0.310
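
With queue ordering "random", the next cluster to refine is drawn uniformly from the queue. A minimal sketch (`select_cluster` is an illustrative name; the script's queue handling may differ):

```python
import random

def select_cluster(queue, rnd=None):
    # "random" queue ordering: remove and return a uniformly
    # chosen cluster from the queue of candidate clusters.
    rnd = rnd or random.Random()
    return queue.pop(rnd.randrange(len(queue)))
```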

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 490 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.714, 0.727, 0.750, 0.294, 0.833] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.348, 0.429, 0.526, 0.529, 0.619] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
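
The farthest-first selections above can be sketched as a greedy traversal: repeatedly pick the weight vector whose minimum Euclidean distance to the already-selected set is largest. A minimal sketch (`farthest_first` is an illustrative name, and starting from the first vector is our simplification; the script may seed the traversal differently):

```python
def farthest_first(vectors, k):
    # Euclidean distance between two weight vectors.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]  # deterministic start, for illustration only
    while len(selected) < min(k, len(vectors)):
        # Greedy step: the candidate maximising its distance
        # to the nearest already-selected vector.
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```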

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 10 matches and 60 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing file: diverg(15)820_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 820), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)820_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 946
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 946 weight vectors
  Containing 219 true matches and 727 true non-matches
    (23.15% true matches)
  Identified 891 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   855  (95.96%)
          2 :    33  (3.70%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)
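
This occurrence distribution — how many unique weight vectors occur once, twice, and so on — can be computed with two passes of `collections.Counter`. A minimal sketch (`occurrence_distribution` is an illustrative name):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First count how often each weight vector occurs (vectors must be
    # made hashable, hence the tuple conversion) ...
    per_vector = Counter(map(tuple, weight_vectors))
    # ... then count how many unique vectors share each occurrence count.
    return Counter(per_vector.values())
```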

Identified 1 non-pure unique weight vectors (from 891 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 706
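
Pureness here is the fraction of true matches among the record pairs that share one unique weight vector; values strictly between 0 and 1 mark a non-pure vector, whose minority-class copies are removed. A minimal sketch (`pureness` is an illustrative name):

```python
def pureness(match_flags):
    # Fraction of true matches among the record pairs sharing one
    # unique weight vector; 1.0 or 0.0 means the vector is pure.
    return sum(match_flags) / len(match_flags)

# e.g. a vector generated by 18 matching and 1 non-matching pair
# would have pureness 18/19, about 0.947
```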

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 945
  Number of unique weight vectors: 891

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (891, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 891 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 891 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 805 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 130 matches and 675 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (675, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 130 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 130 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(15)879_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 879), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)879_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 445
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 445 weight vectors
  Containing 196 true matches and 249 true non-matches
    (44.04% true matches)
  Identified 421 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   404  (95.96%)
          2 :    14  (3.33%)
          3 :     2  (0.48%)
          7 :     1  (0.24%)

Identified 0 non-pure unique weight vectors (from 421 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 247

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 445
  Number of unique weight vectors: 421

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (421, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 421 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 421 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 37 matches and 41 non-matches
    Purity of oracle classification:  0.526
    Entropy of oracle classification: 0.998
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  41
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 343 weight vectors
  Based on 37 matches and 41 non-matches
  Classified 278 matches and 65 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (278, 0.5256410256410257, 0.9981021327390103, 0.47435897435897434)
    (65, 0.5256410256410257, 0.9981021327390103, 0.47435897435897434)

Current size of match and non-match training data sets: 37 / 41

Selected cluster with (queue ordering: random):
- Purity 0.53 and entropy 1.00
- Size 278 weight vectors
- Estimated match proportion 0.474

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 278 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.933, 1.000, 1.000, 1.000] (True)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
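The "Farthest first selection" steps above can be sketched as a greedy max-min-distance loop: start from one vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. This is an illustrative sketch, not the program's own code; the Euclidean metric, the seeding, and the function name `farthest_first` are assumptions.

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first traversal: pick a (here random) start
    vector, then repeatedly select the vector farthest from all
    vectors selected so far (max of per-vector nearest distances).

    Sketch only: metric, seeding and name are assumptions."""
    rnd = random.Random(seed)
    selected = [rnd.choice(vectors)]
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        # Update nearest-selected distances with the newly added vector
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected
```

Because each newly selected vector gets a nearest distance of zero, it is never picked again, so the loop yields `k` distinct vectors whenever the input contains no duplicates.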

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 41 matches and 30 non-matches
    Purity of oracle classification:  0.577
    Entropy of oracle classification: 0.983
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  30
    Number of false non-matches: 0
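The purity and entropy values reported after each oracle step are consistent with the standard definitions for a binary match/non-match split: 41 matches out of 71 gives purity 41/71 ≈ 0.577 and binary entropy ≈ 0.983, as printed above. A minimal sketch (the helper name `purity_entropy` is chosen here for illustration):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary entropy of a
    match/non-match split, matching the values printed after each
    oracle classification in the log."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    # Binary entropy, skipping zero-probability terms (0 * log 0 := 0)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```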

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(10)660_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 660), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)660_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 748
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 748 weight vectors
  Containing 196 true matches and 552 true non-matches
    (26.20% true matches)
  Identified 706 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   671  (95.04%)
          2 :    32  (4.53%)
          3 :     2  (0.28%)
          7 :     1  (0.14%)
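The occurrence distribution above (how many unique weight vectors occur once, twice, and so on) can be reproduced by counting duplicates; a sketch assuming weight vectors are given as lists of floats, with a hypothetical helper name:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Build the 'Occurrence : Number of weight vectors' table from
    the log: count how often each unique vector appears, then count
    how many unique vectors share each occurrence count."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```

The number of unique weight vectors reported by the log is then just `sum(occurrence_distribution(vectors).values())`.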

Identified 0 non-pure unique weight vectors (from 706 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 532

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 748
  Number of unique weight vectors: 706

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (706, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 706 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 622 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 284 matches and 338 non-matches
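The SVM step above (train on the 84 oracle-labelled vectors, then split the 622 remaining vectors into predicted matches and non-matches, which become the two new clusters in the queue) can be sketched with scikit-learn's `SVC` as a stand-in. The library, kernel choice, and function name are assumptions; the program's actual SVM implementation is not shown here.

```python
from sklearn.svm import SVC  # stand-in; the original SVM library is unknown

def svm_split(sample_vecs, sample_labels, remaining_vecs):
    """Train an SVM on the oracle-labelled sample (labels: 1 = match,
    0 = non-match), then split the remaining weight vectors of the
    cluster into predicted matches and predicted non-matches."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(sample_vecs, sample_labels)
    predictions = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, predictions) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, predictions) if p == 0]
    return matches, non_matches
```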

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (284, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (338, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 338 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 338 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.367, 0.667, 0.583, 0.625, 0.316] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.438, 0.500, 0.467, 0.529, 0.611] (False)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.818, 0.727, 0.438, 0.375, 0.400] (False)
    [0.857, 0.000, 0.500, 0.389, 0.235, 0.045, 0.526] (False)
    [1.000, 0.000, 0.476, 0.179, 0.500, 0.412, 0.357] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.833, 0.571, 0.727, 0.647, 0.857] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.583, 0.875, 0.727, 0.833, 0.643] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(10)818_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.980198
recall                 0.331104
f-measure                 0.495
da                          101
dm                            0
ndm                           0
tp                           99
fp                            2
tn                  4.76529e+07
fn                          200
Name: (10, 1 - acm diverg, 818), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)818_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 265
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 265 weight vectors
  Containing 152 true matches and 113 true non-matches
    (57.36% true matches)
  Identified 250 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   239  (95.60%)
          2 :     8  (3.20%)
          3 :     2  (0.80%)
          4 :     1  (0.40%)

Identified 0 non-pure unique weight vectors (from 250 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 139
     0.000 : 111

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 265
  Number of unique weight vectors: 250

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (250, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 250 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 69

Perform initial selection using "far" method

Farthest first selection of 69 weight vectors from 250 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 35 matches and 34 non-matches
    Purity of oracle classification:  0.507
    Entropy of oracle classification: 1.000
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  34
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 181 weight vectors
  Based on 35 matches and 34 non-matches
  Classified 115 matches and 66 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 69
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (115, 0.5072463768115942, 0.9998484829291058, 0.5072463768115942)
    (66, 0.5072463768115942, 0.9998484829291058, 0.5072463768115942)

Current size of match and non-match training data sets: 35 / 34

Selected cluster with (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 115 weight vectors
- Estimated match proportion 0.507

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 115 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00 accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 42 matches and 11 non-matches
    Purity of oracle classification:  0.792
    Entropy of oracle classification: 0.737
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

101.0
Analysing the file: diverg(15)729_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (15, 1 - acm diverg, 729), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)729_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1031
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1031 weight vectors
  Containing 203 true matches and 828 true non-matches
    (19.69% true matches)
  Identified 981 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   947  (96.53%)
          2 :    31  (3.16%)
          3 :     2  (0.20%)
         16 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 981 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.938 :  1   (minority-class weight vectors with this pureness to be removed)
     0.000 : 807

Removed 1 non-pure weight vector
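Removing a non-pure weight vector amounts to dropping the minority-class copies of any unique vector that occurs with both match and non-match labels (pureness strictly between 0 and 1; e.g. a pureness of 0.938 could arise from 15 match copies out of 16). A hedged sketch; the tie handling and the helper name are assumptions:

```python
from collections import Counter

def remove_non_pure(weight_vectors, labels):
    """Drop minority-class copies of any unique weight vector seen
    with both match and non-match labels. Sketch only: tie handling
    (ties kept as matches) and the helper name are assumptions."""
    totals, match_counts = Counter(), Counter()
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)
        totals[key] += 1
        if is_match:
            match_counts[key] += 1
    kept_vecs, kept_labels = [], []
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)
        pure = match_counts[key] in (0, totals[key])   # only one label seen
        majority_is_match = match_counts[key] * 2 >= totals[key]
        if pure or is_match == majority_is_match:
            kept_vecs.append(vec)
            kept_labels.append(is_match)
    return kept_vecs, kept_labels
```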

Final number of weight vectors to use: 1030
  Number of unique weight vectors: 981

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (981, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 981 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 981 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
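
The "far" method above is a farthest-first traversal. A minimal sketch of the idea follows; the function name, Euclidean distance, and random seed vector are my own assumptions, and the program's actual distance metric and tie-breaking may differ:

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first traversal: start from one vector, then
    repeatedly add the vector whose minimum distance to the already
    selected set is largest."""
    rnd = random.Random(seed)
    selected = [rnd.choice(vectors)]
    remaining = [v for v in vectors if v is not selected[0]]
    while len(selected) < k and remaining:
        # Pick the remaining vector farthest from the selected set.
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

By construction the selection spreads across the weight-vector space, which is why the sampled vectors above mix clear matches, clear non-matches, and borderline cases.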

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
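
For reference, the purity and entropy figures reported here (0.701 and 0.880 for 26 matches and 61 non-matches) are the majority-class fraction and the binary Shannon entropy of the match split; a quick check (function name is mine):

```python
import math

def purity_entropy(num_match, num_nonmatch):
    """Purity: fraction of the majority class.
    Entropy: binary Shannon entropy of the match split, in bits."""
    total = num_match + num_nonmatch
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# The oracle above classified 26 matches and 61 non-matches:
purity, entropy = purity_entropy(26, 61)  # -> 0.701..., 0.8798...
```

These are the same values that reappear for both child clusters in the queue of the next loop, since each child inherits its parent's statistics until it is sampled itself.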

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 894 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 101 matches and 793 non-matches
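
The split step trains a classifier on the oracle-labelled sample and applies it to the remaining cluster. The program uses an SVM (per the log); as a dependency-free stand-in, a simple perceptron illustrates the same train-then-split step:

```python
def train_perceptron(X, y, epochs=100):
    """Learn (w, b) with the classic perceptron rule; y in {+1, -1}.
    Converges when the training data is linearly separable."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        updated = False
        for xi, yi in zip(X, y):
            # Misclassified (or on the boundary): nudge w, b toward xi.
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) <= 0:
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                updated = True
        if not updated:
            break
    return w, b

def split_cluster(w, b, vectors):
    """Partition a cluster by the sign of the decision function."""
    score = lambda v: sum(wj * xj for wj, xj in zip(w, v)) + b
    matches = [v for v in vectors if score(v) > 0]
    non_matches = [v for v in vectors if score(v) <= 0]
    return matches, non_matches
```

Both resulting partitions are pushed back onto the cluster queue, as the "Queue length: 2" line in the next loop shows.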

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 101 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 45
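
The sample sizes in the log (87 of 981 at estimated match proportion 0.5, 45 of 101 at 0.299) are consistent with the standard sample-size formula with finite-population correction, using z = 1.96, a sampling error of 0.1, and the cluster's estimated match proportion as p. This is my reading of the numbers, not the program's actual code:

```python
def sample_size(cluster_size, match_prop, z=1.96, err=0.1):
    """Cochran-style sample size with finite-population correction.
    Truncating to int reproduces most of the logged sizes; the
    program's exact rounding may differ."""
    n0 = z * z * match_prop * (1.0 - match_prop) / (err * err)
    return int(n0 / (1.0 + (n0 - 1.0) / cluster_size))
```

Note how a lower estimated match proportion shrinks p(1 - p) and therefore the sample, which is why the smaller, more skewed clusters later in the run get proportionally smaller samples.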

Farthest first selection of 45 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 45 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0
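
The oracle is simulated with a configurable accuracy (the oracle_acc parameter; 100% in this run, hence zero wrong classifications). A plausible sketch of such a noisy oracle; the function name and the exact-count flipping scheme are assumptions:

```python
import random

def noisy_oracle(true_labels, accuracy, seed=0):
    """Return the true match labels with a fixed fraction flipped:
    round((1 - accuracy) * n) labels are classified wrongly."""
    rnd = random.Random(seed)
    n = len(true_labels)
    num_wrong = round(n * (1.0 - accuracy))
    wrong = set(rnd.sample(range(n), num_wrong))
    return [(not lab) if i in wrong else lab
            for i, lab in enumerate(true_labels)]
```

With accuracy below 1.0, the "false matches" and "false non-matches" counters in the blocks above would become non-zero.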

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
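
Putting the loop together: the control flow visible in the log (pop a cluster from the queue, sample it, oracle-classify the sample, stop once the manual-classification budget is spent, otherwise re-queue what remains for further splitting) can be skeletonised as below, with deliberately simplified stand-ins for the sampling and splitting steps:

```python
from collections import deque

def recursive_selection(vectors, oracle, budget, sample_size=10):
    """Skeleton of the selection loop: each iteration spends part of
    the manual-classification budget on one cluster and re-queues the
    unlabelled remainder for further splitting."""
    queue = deque([list(vectors)])
    train_match, train_nonmatch, used = [], [], 0
    while queue and used < budget:
        cluster = queue.popleft()
        sample = cluster[:sample_size]   # stand-in for farthest-first
        rest = cluster[sample_size:]
        for v in sample:                 # oracle-classify the sample
            (train_match if oracle(v) else train_nonmatch).append(v)
        used += len(sample)
        if rest:                         # stand-in for the SVM split
            queue.append(rest)
    return train_match, train_nonmatch, used
```

As in the log, the budget check happens between clusters, so the final sample may overshoot the budget slightly before the loop stops.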

63.0
Analysing the file: diverg(10)318_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987952
recall                 0.274247
f-measure              0.429319
da                           83
dm                            0
ndm                           0
tp                           82
fp                            1
tn                  4.76529e+07
fn                          217
Name: (10, 1 - acm diverg, 318), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)318_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 504
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 504 weight vectors
  Containing 147 true matches and 357 true non-matches
    (29.17% true matches)
  Identified 488 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   481  (98.57%)
          2 :     4  (0.82%)
          3 :     2  (0.41%)
          9 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 488 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 131
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 356

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 495
  Number of unique weight vectors: 487

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (487, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 487 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 487 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.714, 0.353, 0.583, 0.571] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 26 matches and 54 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 407 weight vectors
  Based on 26 matches and 54 non-matches
  Classified 110 matches and 297 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (110, 0.675, 0.9097361225311662, 0.325)
    (297, 0.675, 0.9097361225311662, 0.325)

Current size of match and non-match training data sets: 26 / 54

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 110 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 110 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 42 matches and 6 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

83.0
Analysing the file: diverg(10)22_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 22), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)22_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 596
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 596 weight vectors
  Containing 196 true matches and 400 true non-matches
    (32.89% true matches)
  Identified 547 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   513  (93.78%)
          2 :    31  (5.67%)
          3 :     2  (0.37%)
         15 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 547 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 379

Removed 1 non-pure weight vector

Final number of weight vectors to use: 595
  Number of unique weight vectors: 547

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (547, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 547 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 547 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 31 matches and 50 non-matches
    Purity of oracle classification:  0.617
    Entropy of oracle classification: 0.960
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 466 weight vectors
  Based on 31 matches and 50 non-matches
  Classified 159 matches and 307 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6172839506172839, 0.9599377175669783, 0.38271604938271603)
    (307, 0.6172839506172839, 0.9599377175669783, 0.38271604938271603)

Current size of match and non-match training data sets: 31 / 50

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 159 weight vectors
- Estimated match proportion 0.383

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 159 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
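
The farthest-first traversal above can be sketched as follows. This is a minimal reconstruction of the printed behaviour, not the program's own code; Euclidean distance and starting from the first vector are assumptions:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly select the vector
    whose minimum distance to the already-selected set is largest.
    Euclidean distance and the choice of starting vector are
    assumptions; the original program may differ."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # pick the candidate farthest from its nearest selected vector
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

This greedy choice tends to pick vectors near the extremes of the cluster first, which is why the selections above mix clear matches and clear non-matches.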

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 47 matches and 11 non-matches
    Purity of oracle classification:  0.810
    Entropy of oracle classification: 0.701
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0
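
The purity and entropy figures printed above can be reproduced from the match/non-match counts: 47 matches and 11 non-matches give 0.810 and 0.701. A sketch reconstructed from the printed numbers (the program's own code may compute them differently):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary entropy of a
    two-class cluster, matching the values printed in the log
    (e.g. 47 matches / 11 non-matches -> 0.810 / 0.701)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    if p in (0.0, 1.0):
        entropy = 0.0  # a pure cluster has zero entropy
    else:
        entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return purity, entropy
```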

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing the file: diverg(10)439_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 439), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)439_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 544
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 544 weight vectors
  Containing 185 true matches and 359 true non-matches
    (34.01% true matches)
  Identified 511 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   494  (96.67%)
          2 :    14  (2.74%)
          3 :     2  (0.39%)
         16 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 511 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 154
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 356

Removed 1 non-pure weight vector

Final number of weight vectors to use: 543
  Number of unique weight vectors: 511
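
The pureness filtering above, which drops minority-class copies of non-pure unique weight vectors (e.g. the single non-match among the 16 copies with pureness 15/16 = 0.938), can be sketched as follows. The input format of (vector, is_match) pairs is an assumption:

```python
from collections import defaultdict

def remove_minority_copies(pairs):
    """Group (weight_vector, is_match) pairs by vector; for non-pure
    vectors (copies with mixed labels) keep only the majority-class
    copies.  A sketch of the behaviour reported in the log, not the
    program's own code."""
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[tuple(vec)].append((vec, is_match))
    kept = []
    for copies in groups.values():
        labels = [m for _, m in copies]
        if len(set(labels)) == 1:       # pure vector: keep all copies
            kept.extend(copies)
            continue
        # True if the match copies are the (weak) majority
        majority = sum(labels) * 2 >= len(labels)
        kept.extend(c for c in copies if c[1] == majority)
    return kept
```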

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (511, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 511 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 511 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 31 matches and 50 non-matches
    Purity of oracle classification:  0.617
    Entropy of oracle classification: 0.960
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 430 weight vectors
  Based on 31 matches and 50 non-matches
  Classified 127 matches and 303 non-matches
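
The SVM split above (training on the 81 oracle-labelled vectors, then dividing the remaining 430 into predicted matches and non-matches) might look like the following sketch; scikit-learn's `SVC` is used here as a stand-in, since the log does not say which SVM implementation the program uses:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train an SVM on the oracle-classified weight vectors and split
    the remaining cluster by predicted class.  The use of scikit-learn
    and the linear kernel are assumptions for illustration."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, preds) if p]
    non_matches = [v for v, p in zip(rest_vecs, preds) if not p]
    return matches, non_matches
```

Each predicted sub-cluster is then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.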

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (127, 0.6172839506172839, 0.9599377175669783, 0.38271604938271603)
    (303, 0.6172839506172839, 0.9599377175669783, 0.38271604938271603)

Current size of match and non-match training data sets: 31 / 50

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 303 weight vectors
- Estimated match proportion 0.383

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 303 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.778, 0.429, 0.571, 0.750, 0.600] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.750, 0.750, 0.688, 0.500, 0.800] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.625, 0.526, 0.300, 0.778, 0.609] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.826, 0.286, 0.857, 0.643] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.611, 0.000, 0.800, 0.684, 0.500, 0.778, 0.609] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.714, 0.500, 0.500, 0.412, 0.571] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 0 matches and 70 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0
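
The oracle's configurable accuracy ("correctly classify N ... wrongly classify M") might be simulated as in this hypothetical sketch, which flips each true label with probability 1 - accuracy; the program may instead flip an exact count of labels:

```python
import random

def noisy_oracle(true_labels, accuracy, seed=0):
    """Return each true match status correctly with probability
    `accuracy`, flipped otherwise.  With accuracy 1.0 (as in this run)
    every label is returned unchanged.  A sketch, not the program's
    own oracle implementation."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```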

*** Warning: Oracle returns an empty match dictionary ***
Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(10)951_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.982143
recall                 0.183946
f-measure              0.309859
da                           56
dm                            0
ndm                           0
tp                           55
fp                            1
tn                  4.76529e+07
fn                          244
Name: (10, 1 - acm diverg, 951), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)951_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 668
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 668 weight vectors
  Containing 201 true matches and 467 true non-matches
    (30.09% true matches)
  Identified 617 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   583  (94.49%)
          2 :    31  (5.02%)
          3 :     2  (0.32%)
         17 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 617 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 170
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 446

Removed 1 non-pure weight vector

Final number of weight vectors to use: 667
  Number of unique weight vectors: 617

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (617, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 617 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 617 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 24 matches and 59 non-matches
    Purity of oracle classification:  0.711
    Entropy of oracle classification: 0.868
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 534 weight vectors
  Based on 24 matches and 59 non-matches
  Classified 98 matches and 436 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (98, 0.7108433734939759, 0.8676293117125106, 0.2891566265060241)
    (436, 0.7108433734939759, 0.8676293117125106, 0.2891566265060241)

Current size of match and non-match training data sets: 24 / 59

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 98 weight vectors
- Estimated match proportion 0.289

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 98 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 44 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

56.0
Analysing the file: diverg(10)537_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 537), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)537_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 663
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 663 weight vectors
  Containing 194 true matches and 469 true non-matches
    (29.26% true matches)
  Identified 642 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   628  (97.82%)
          2 :    11  (1.71%)
          3 :     2  (0.31%)
          7 :     1  (0.16%)

Identified 0 non-pure unique weight vectors (from 642 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 469

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 663
  Number of unique weight vectors: 642

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (642, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 642 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 642 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
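
The selection above is a farthest-first traversal: repeatedly pick the weight vector whose minimum distance to the already-selected vectors is largest. A minimal sketch (an illustrative reimplementation, not the original code; the seed choice and Euclidean distance are assumptions):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal (greedy max-min distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]              # seed with an arbitrary vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # pick the candidate farthest from its nearest selected vector
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each pass over `remaining` recomputes distances to all selected vectors, so this is quadratic in the sample size, which is acceptable for clusters of a few hundred vectors as here.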

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
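
The purity and entropy the program reports can be reproduced from the oracle's class counts alone. A minimal sketch using the standard definitions (the function names are ours, not the original program's):

```python
import math

def purity(n_match, n_non_match):
    """Fraction of vectors belonging to the majority class."""
    return max(n_match, n_non_match) / (n_match + n_non_match)

def entropy(n_match, n_non_match):
    """Binary (base-2) entropy of the match/non-match split."""
    total = n_match + n_non_match
    h = 0.0
    for n in (n_match, n_non_match):
        if n > 0:
            p = n / total
            h -= p * math.log2(p)
    return h

# For the 28 matches and 55 non-matches classified above:
# purity(28, 55) ≈ 0.663, entropy(28, 55) ≈ 0.922
```

These are the same values, to full precision, that reappear in the queue tuples printed at the start of Loop 2.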

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 559 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 136 matches and 423 non-matches
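
The split step trains a classifier on the oracle-labelled sample and partitions the remaining vectors by its predictions. A hedged sketch assuming scikit-learn's `SVC` (the original program's SVM implementation and parameters may differ):

```python
from sklearn.svm import SVC  # assumption: scikit-learn provides the SVM

def split_cluster(labelled_vecs, labels, remaining_vecs):
    """Train an SVM on the labelled vectors, split the rest by its predictions."""
    clf = SVC()                        # default RBF kernel
    clf.fit(labelled_vecs, labels)     # labels: 1 = match, 0 = non-match
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if p == 0]
    return matches, non_matches
```

Both halves go back onto the cluster queue, which is why the queue length grows to 2 in the next loop.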

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (423, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 423 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 423 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.783, 0.583, 0.435, 0.765, 0.429] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 9 matches and 62 non-matches
    Purity of oracle classification:  0.873
    Entropy of oracle classification: 0.548
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analyzing file: diverg(15)759_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 759), dtype: object
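
The precision, recall, and f-measure in the row above follow directly from the confusion counts (tp, fp, fn). A quick check using the standard formulas:

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f
```

With tp=42, fp=0, fn=257 this gives precision 1.0, recall ≈ 0.140468, and f-measure ≈ 0.246334, matching the values shown.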

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)759_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 750
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 750 weight vectors
  Containing 222 true matches and 528 true non-matches
    (29.60% true matches)
  Identified 714 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   695  (97.34%)
          2 :    16  (2.24%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 714 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 525

Removed 1 non-pure weight vector

Final number of weight vectors to use: 749
  Number of unique weight vectors: 714
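
The pureness step above can be sketched as follows: group identical weight vectors, take the fraction of true matches in each group as its pureness, and drop the minority-class copies of any group that is neither fully pure nor fully impure (an illustrative reimplementation, not the original code):

```python
from collections import defaultdict

def remove_non_pure(vectors, labels):
    """vectors: list of tuples; labels: True for match, False for non-match."""
    groups = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        groups[vec].append(lab)
    kept = []
    for vec, labs in groups.items():
        pureness = sum(labs) / len(labs)        # fraction of true matches
        if 0.0 < pureness < 1.0:
            keep_label = pureness >= 0.5        # majority class wins
            kept += [(vec, l) for l in labs if l == keep_label]
        else:
            kept += [(vec, l) for l in labs]    # pure group: keep everything
    return kept
```

For example, a vector occurring 17 times with 16 matches and 1 non-match has pureness 16/17 ≈ 0.941, and the single minority-class copy is removed, as in the run above.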

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (714, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 714 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 714 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 630 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 133 matches and 497 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (497, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 497 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 497 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 13 matches and 60 non-matches
    Purity of oracle classification:  0.822
    Entropy of oracle classification: 0.676
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(20)787_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 787), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)787_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
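
The purity and entropy figures the log reports after each oracle round can be reproduced from the match / non-match counts alone. A minimal sketch, with function names of our own choosing (the script's internals are not shown in this log):

```python
import math

def cluster_purity(num_match, num_nonmatch):
    """Purity = fraction of the cluster belonging to the majority class."""
    total = num_match + num_nonmatch
    return max(num_match, num_nonmatch) / total

def cluster_entropy(num_match, num_nonmatch):
    """Binary Shannon entropy (in bits) of the match / non-match split."""
    total = num_match + num_nonmatch
    p = num_match / total
    if p in (0.0, 1.0):
        return 0.0  # a pure cluster has zero entropy
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Reproduces the figures above for 26 matches and 61 non-matches:
purity = cluster_purity(26, 61)    # ~ 0.7011
entropy = cluster_entropy(26, 61)  # ~ 0.8799
```

These are exactly the (purity, entropy) values carried in the cluster queue tuples printed at the start of each loop.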

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 118 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 118 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
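
The "farthest first" selections listed throughout this log can be sketched as a greedy max-min traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest. The Euclidean metric and seeding from the first vector are our assumptions; the log does not show either choice:

```python
import numpy as np

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [0]  # seed with the first vector (an arbitrary choice)
    # Minimum distance from every vector to the current selected set
    min_dist = np.linalg.norm(vectors - vectors[0], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))  # farthest from all selected so far
        selected.append(nxt)
        dist = np.linalg.norm(vectors - vectors[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return vectors[selected]
```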

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
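
The overall loop this log traces (sample a cluster, label the sample via the oracle, split impure clusters, stop when the labelling budget is spent) can be outlined as below. The oracle and split steps are passed in as callables, and all names are illustrative rather than taken from the original script:

```python
import random

def select_training(clusters, oracle, split, budget,
                    min_purity=0.95, sample_size=10):
    """Budget-limited recursive training example selection (sketch)."""
    queue = list(clusters)
    used, train_m, train_n = 0, [], []
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random queue ordering
        sample = random.sample(cluster, min(sample_size, len(cluster)))
        labels = [oracle(v) for v in sample]               # manual classifications
        used += len(sample)
        train_m += [v for v, lab in zip(sample, labels) if lab]
        train_n += [v for v, lab in zip(sample, labels) if not lab]
        rest = [v for v in cluster if v not in sample]
        purity = max(sum(labels), len(labels) - sum(labels)) / len(labels)
        if rest and purity < min_purity:                   # impure: split further
            part_a, part_b = split(rest, train_m, train_n)
            if part_a:
                queue.append(part_a)
            if part_b:
                queue.append(part_b)
    return train_m, train_n
```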

42.0
Analyzing file: diverg(15)51_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 51), dtype: object
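
As a quick sanity check, the recall and f-measure in the Series above follow directly from the tp/fp/fn counts (precision is 1 because fp is 0):

```python
tp, fp, fn = 57, 0, 242  # counts from the Series above

precision = tp / (tp + fp)                                 # 1.0
recall = tp / (tp + fn)                                    # ~ 0.190635
f_measure = 2 * precision * recall / (precision + recall)  # ~ 0.320225
```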

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)51_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 824
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 824 weight vectors
  Containing 209 true matches and 615 true non-matches
    (25.36% true matches)
  Identified 777 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   742  (95.50%)
          2 :    32  (4.12%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 777 unique weight vectors)
Pureness (as a fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 594

Removed 1 non-pure weight vector
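
The clean-up just reported (dropping minority-class copies of identical weight vectors that carry conflicting true-match labels) might look like the sketch below; the function and variable names are ours:

```python
from collections import defaultdict

def remove_non_pure(vectors, labels):
    """Keep only majority-class copies of each unique weight vector."""
    groups = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        groups[tuple(vec)].append(lab)
    keep_vecs, keep_labs = [], []
    for vec, labs in groups.items():
        match_frac = sum(labs) / len(labs)  # "pureness" of this unique vector
        majority = match_frac >= 0.5        # ties default to the match class
        for lab in labs:
            if lab == majority:             # drop minority-class copies
                keep_vecs.append(list(vec))
                keep_labs.append(lab)
    return keep_vecs, keep_labs
```

In the run above, the single vector with pureness 0.917 (11 matching copies out of 12 occurrences) loses its one non-matching copy, leaving 823 vectors.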

Final number of weight vectors to use: 823
  Number of unique weight vectors: 777

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (777, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 777 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 777 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 692 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 153 matches and 539 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (539, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 539 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 539 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.791, 1.000, 0.275, 0.269, 0.192, 0.084, 0.200] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 2 matches and 72 non-matches
    Purity of oracle classification:  0.973
    Entropy of oracle classification: 0.179
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(20)188_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (20, 1 - acm diverg, 188), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)188_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 920
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 920 weight vectors
  Containing 215 true matches and 705 true non-matches
    (23.37% true matches)
  Identified 868 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   832  (95.85%)
          2 :    33  (3.80%)
          3 :     2  (0.23%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 868 unique weight vectors)
Pureness (as a fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 684

Removed 1 non-pure weight vector

Final number of weight vectors to use: 919
  Number of unique weight vectors: 868

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (868, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 868 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 868 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 782 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 158 matches and 624 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (158, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (624, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 624 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 624 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 2 matches and 72 non-matches
    Purity of oracle classification:  0.973
    Entropy of oracle classification: 0.179
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0
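
The purity and entropy figures reported in the oracle blocks follow the standard definitions for a binary match/non-match split: purity is the majority-class fraction of the classified sample, and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch (the function name `purity_entropy` is illustrative, not taken from the program):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity is the fraction of the majority class; entropy is the
    binary Shannon entropy (in bits) of the match/non-match split."""
    n = num_match + num_non_match
    p = num_match / n
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy
```

For the oracle result above, `purity_entropy(2, 72)` gives purity ≈ 0.973 and entropy ≈ 0.179, matching the reported values.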

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(20)91_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 91), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)91_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
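
The "far" initial selection above is a farthest-first traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch under stated assumptions (the choice of starting vector and the Euclidean metric are guesses; the program's implementation may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from an arbitrary vector,
    then repeatedly pick the vector whose minimum distance to the
    already-selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    # Minimum distance from each candidate to the selected set so far.
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        # Fold the newly selected vector into the minimum distances.
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], dist(v, vectors[i]))
    return selected
```

This greedy rule is the classic 2-approximation for the k-center problem, which is why the selected vectors spread across the similarity space and tend to cover both matches and non-matches.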

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
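
The SVM step above trains on the oracle-classified sample and uses its predictions to split the remaining unclassified vectors into a predicted-match and a predicted-non-match child cluster. A hedged sketch using scikit-learn (`svm_split` is a hypothetical name, and the linear kernel with default parameters is an assumption; the program's actual SVM settings are not shown in this log):

```python
import numpy as np
from sklearn import svm

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on the oracle-labelled sample (labels: 1 = match,
    0 = non-match) and partition the remaining weight vectors into a
    predicted-match and a predicted-non-match cluster."""
    clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(remaining_vecs))
    matches = [v for v, p in zip(remaining_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, pred) if p == 0]
    return matches, non_matches
```

Both child clusters are then pushed back onto the queue, inheriting the parent sample's purity, entropy, and estimated match proportion until they are sampled themselves.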

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)648_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 648), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)648_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 148 matches and 784 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (784, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 784 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 784 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 8 matches and 66 non-matches
    Purity of oracle classification:  0.892
    Entropy of oracle classification: 0.494
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)257_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 257), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)257_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 298
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 298 weight vectors
  Containing 189 true matches and 109 true non-matches
    (63.42% true matches)
  Identified 274 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   261  (95.26%)
          2 :    10  (3.65%)
          3 :     2  (0.73%)
         11 :     1  (0.36%)

Identified 1 non-pure unique weight vector (from 274 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 165
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 108

Removed 1 non-pure weight vector

Final number of weight vectors to use: 297
  Number of unique weight vectors: 274

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (274, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 274 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 71

Perform initial selection using "far" method

Farthest first selection of 71 weight vectors from 274 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
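The "farthest first" selection traced above is a standard greedy farthest-first traversal: start from one weight vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. A minimal sketch (the Euclidean metric and the choice of starting vector are assumptions, not necessarily the program's exact settings):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: begin with an arbitrary vector,
    then repeatedly pick the vector with the largest distance to its
    closest already-selected vector, until k vectors are chosen."""
    selected = [vectors[0]]
    # Distance of each candidate to its nearest selected vector so far
    dists = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: dists[j])
        selected.append(vectors[i])
        # Update nearest-selected distances with the newly added vector
        dists = [min(d, math.dist(v, vectors[i]))
                 for d, v in zip(dists, vectors)]
    return selected
```

This greedy rule tends to pick vectors spread across the corners of the similarity space, which is why the selected samples above mix clear matches (all-high similarities) with clear non-matches (all-low similarities).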

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 36 matches and 35 non-matches
    Purity of oracle classification:  0.507
    Entropy of oracle classification: 1.000
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  35
    Number of false non-matches: 0
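The purity and entropy figures printed here (and in the cluster queue, e.g. 0.5070422535211268 / 0.9998568991526107 for 36 matches and 35 non-matches) are the majority-class fraction and the binary Shannon entropy of the match/non-match split. A minimal sketch (function name is illustrative):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class in the cluster.
    Entropy: binary Shannon entropy (in bits) of the match split."""
    n = num_matches + num_non_matches
    p = num_matches / n
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # by convention 0 * log2(0) = 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

A pure cluster (all matches or all non-matches) has purity 1.0 and entropy 0.0, as in the later oracle round that classified 0 matches and 38 non-matches.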

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 203 weight vectors
  Based on 36 matches and 35 non-matches
  Classified 141 matches and 62 non-matches
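The split step above trains a classifier on the oracle-labelled vectors and uses it to divide the remaining cluster into two sub-clusters (here 141 predicted matches and 62 predicted non-matches). A minimal sketch using scikit-learn's `SVC` (the linear kernel and function name are assumptions, not necessarily the program's configuration):

```python
from sklearn import svm

def svm_split(train_matches, train_non_matches, unlabelled):
    """Train a binary SVM on oracle-labelled weight vectors, then
    split the remaining unlabelled vectors into two sub-clusters
    according to the predicted class."""
    X = train_matches + train_non_matches
    y = [1] * len(train_matches) + [0] * len(train_non_matches)
    clf = svm.SVC(kernel="linear")
    clf.fit(X, y)
    pred = clf.predict(unlabelled)
    match_cluster = [v for v, p in zip(unlabelled, pred) if p == 1]
    non_match_cluster = [v for v, p in zip(unlabelled, pred) if p == 0]
    return match_cluster, non_match_cluster
```

Both sub-clusters are then pushed back onto the queue, which is why the next loop reports a queue length of 2.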

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 71
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.5070422535211268, 0.9998568991526107, 0.5070422535211268)
    (62, 0.5070422535211268, 0.9998568991526107, 0.5070422535211268)

Current size of match and non-match training data sets: 36 / 35

Selected cluster with (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 62 weight vectors
- Estimated match proportion 0.507

Sample size for this cluster: 38

Farthest first selection of 38 weight vectors from 62 vectors
  The selected farthest weight vectors are:
    [0.530, 1.000, 0.159, 0.086, 0.182, 0.159, 0.163] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.800, 1.000, 0.242, 0.121, 0.200, 0.171, 0.000] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 38 weight vectors
  The oracle will correctly classify 38 weight vectors and wrongly classify 0
  Classified 0 matches and 38 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 38 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)880_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 880), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)880_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 810
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 810 weight vectors
  Containing 223 true matches and 587 true non-matches
    (27.53% true matches)
  Identified 756 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   719  (95.11%)
          2 :    34  (4.50%)
          3 :     2  (0.26%)
         17 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 756 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 566

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 809
  Number of unique weight vectors: 756

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (756, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 756 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 756 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 671 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 94 matches and 577 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (577, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 577 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 577 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 20 matches and 53 non-matches
    Purity of oracle classification:  0.726
    Entropy of oracle classification: 0.847
    Number of true matches:      20
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)838_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 838), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)838_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 733
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 733 weight vectors
  Containing 198 true matches and 535 true non-matches
    (27.01% true matches)
  Identified 691 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   656  (94.93%)
          2 :    32  (4.63%)
          3 :     2  (0.29%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 691 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 515

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 733
  Number of unique weight vectors: 691

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (691, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 691 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 691 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 26 matches and 58 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.893
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 607 weight vectors
  Based on 26 matches and 58 non-matches
  Classified 136 matches and 471 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)
    (471, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)

Current size of match and non-match training data sets: 26 / 58

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 136 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

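The "farthest first" step above is the classic greedy farthest-first traversal: start from a seed vector and repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and a first-vector seed (the script's actual metric and seed rule may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly select the vector whose
    minimum Euclidean distance to the already-selected set is largest."""
    if not vectors or k <= 0:
        return []
    selected = [vectors[0]]  # seed: first vector (an assumption)
    # Minimum distance from every candidate to the selected set so far.
    min_d = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        # Update each candidate's distance to its nearest selected vector.
        min_d = [min(d, math.dist(v, vectors[i]))
                 for v, d in zip(vectors, min_d)]
    return selected
```

Because each pick maximises the distance to everything chosen so far, the selection spreads over the whole cluster rather than concentrating in one region, which is why the listings above mix clearly-matching and clearly-non-matching vectors.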
Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 49 matches and 2 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.239
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

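The purity and entropy figures reported for each oracle classification (0.961 and 0.239 for the 49-match / 2-non-match split above) follow the standard two-class definitions: purity is the majority-class fraction, entropy the binary Shannon entropy in bits.

```python
import math

def purity_entropy(num_match, num_non_match):
    """Two-class purity (majority fraction) and Shannon entropy in bits."""
    n = num_match + num_non_match
    p = num_match / n                 # match proportion
    purity = max(p, 1.0 - p)          # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                   # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

purity_entropy(35, 33) reproduces the (0.5147..., 0.9993...) queue entries shown in Loop 2 below, so the children of a split cluster evidently inherit the purity and match-proportion estimates observed by the oracle on the parent's sample.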
Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(10)526_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.980198
recall                 0.331104
f-measure                 0.495
da                          101
dm                            0
ndm                           0
tp                           99
fp                            2
tn                  4.76529e+07
fn                          200
Name: (10, 1 - acm diverg, 526), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)526_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 248
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 248 weight vectors
  Containing 147 true matches and 101 true non-matches
    (59.27% true matches)
  Identified 233 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   222  (95.28%)
          2 :     8  (3.43%)
          3 :     2  (0.86%)
          4 :     1  (0.43%)

Identified 0 non-pure unique weight vectors (from 233 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 134
     0.000 : 99

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 248
  Number of unique weight vectors: 233

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (233, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 233 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 68

Perform initial selection using "far" method

Farthest first selection of 68 weight vectors from 233 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.344, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 35 matches and 33 non-matches
    Purity of oracle classification:  0.515
    Entropy of oracle classification: 0.999
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  33
    Number of false non-matches: 0

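The wording "will correctly classify N weight vectors and wrongly classify M" suggests the oracle is simulated by fixing the number of wrong answers up front (from the oracle_acc argument) and flipping that many randomly chosen true labels. A sketch under that assumption:

```python
import random

def oracle_classify(vectors, true_labels, accuracy, rng=None):
    """Simulate an imperfect oracle: of n vectors, round(n * accuracy) keep
    their true label and the rest (chosen at random) are flipped."""
    rng = rng or random.Random(42)    # fixed seed for reproducibility
    n = len(vectors)
    num_wrong = n - int(round(n * accuracy))
    wrong = set(rng.sample(range(n), num_wrong))
    matches, non_matches = [], []
    for i, (vec, is_match) in enumerate(zip(vectors, true_labels)):
        reported = (not is_match) if i in wrong else is_match
        (matches if reported else non_matches).append(vec)
    return matches, non_matches
```

With accuracy 1.0, as in this run, num_wrong is 0 and the oracle simply reveals the true match status of every sampled vector.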
Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 165 weight vectors
  Based on 35 matches and 33 non-matches
  Classified 105 matches and 60 non-matches

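After the oracle-labelled sample is removed, the remaining weight vectors are partitioned by a classifier trained on the labelled ones; the log names an SVM (the split_classifier argument), for which scikit-learn's sklearn.svm.SVC would be a natural choice, though the log does not say. As a dependency-free stand-in, the sketch below uses a nearest-centroid rule, which plays the same structural role of producing predicted-match and predicted-non-match child clusters:

```python
def split_cluster(cluster, train_matches, train_non_matches):
    """Split the remaining cluster into predicted-match and
    predicted-non-match child clusters (nearest-centroid stand-in for
    the SVM used by the program)."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_match = centroid(train_matches)
    c_non = centroid(train_non_matches)
    pred_match, pred_non = [], []
    for v in cluster:
        closer_to_match = sq_dist(v, c_match) <= sq_dist(v, c_non)
        (pred_match if closer_to_match else pred_non).append(v)
    return pred_match, pred_non
```

Either way, the two predicted classes become the two new clusters pushed onto the queue, which is why Loop 2 shows a queue of length 2 with sizes 105 and 60.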
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 68
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (105, 0.5147058823529411, 0.9993759069576514, 0.5147058823529411)
    (60, 0.5147058823529411, 0.9993759069576514, 0.5147058823529411)

Current size of match and non-match training data sets: 35 / 33

Selected cluster with (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 60 weight vectors
- Estimated match proportion 0.515

Sample size for this cluster: 37

Farthest first selection of 37 weight vectors from 60 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)

Perform oracle with 100.00% accuracy on 37 weight vectors
  The oracle will correctly classify 37 weight vectors and wrongly classify 0
  Classified 0 matches and 37 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 37 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

101.0
Analysing file: diverg(10)931_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990099
recall                 0.334448
f-measure                   0.5
da                          101
dm                            0
ndm                           0
tp                          100
fp                            1
tn                  4.76529e+07
fn                          199
Name: (10, 1 - acm diverg, 931), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)931_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 999
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 999 weight vectors
  Containing 164 true matches and 835 true non-matches
    (16.42% true matches)
  Identified 960 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   931  (96.98%)
          2 :    26  (2.71%)
          3 :     2  (0.21%)
         10 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 960 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 145
     0.900 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 814

Removed 1 non-pure weight vector

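A weight vector that occurs several times with conflicting true-match labels is "non-pure"; its pureness is the fraction of its copies that are true matches. The counts above suggest only the minority-class copies are removed: the 0.900 group is consistent with the vector that occurs 10 times (9 matches, 1 non-match), of which only the single non-match copy is dropped, taking 999 vectors down to 998. A sketch of that clean-up:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels):
    """Group identical weight vectors, compute their pureness (fraction of
    true matches), and drop the minority-class copies of any non-pure group
    so each unique vector is left with a single class."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, labels):
        groups[tuple(vec)].append(is_match)

    kept_vecs, kept_labels = [], []
    for vec, is_match in zip(weight_vectors, labels):
        flags = groups[tuple(vec)]
        pureness = sum(flags) / len(flags)
        majority_is_match = pureness >= 0.5   # ties resolved toward match
        if 0.0 < pureness < 1.0 and is_match != majority_is_match:
            continue  # minority-class copy of a non-pure vector: remove
        kept_vecs.append(vec)
        kept_labels.append(is_match)
    return kept_vecs, kept_labels
```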
Final number of weight vectors to use: 998
  Number of unique weight vectors: 960

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (960, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 960 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 960 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 873 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 84 matches and 789 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (84, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (789, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 789 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 789 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 14 matches and 56 non-matches
    Purity of oracle classification:  0.800
    Entropy of oracle classification: 0.722
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

101.0
Analysing file: diverg(15)997_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 997), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)997_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 665
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 665 weight vectors
  Containing 217 true matches and 448 true non-matches
    (32.63% true matches)
  Identified 628 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   610  (97.13%)
          2 :    15  (2.39%)
          3 :     2  (0.32%)
         19 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 628 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 445

Removed 1 non-pure weight vector

Final number of weight vectors to use: 664
  Number of unique weight vectors: 628

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (628, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 628 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 628 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
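
The purity and entropy values reported after each oracle step follow directly from the match/non-match counts: purity is the majority-class proportion, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch reproducing the figures above (function names are illustrative, not taken from the original script):

```python
import math

def cluster_purity(num_matches, num_non_matches):
    """Purity = proportion of the majority class in the cluster."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    """Binary Shannon entropy of the match proportion (0 for a pure cluster)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

# The 27 matches / 56 non-matches classified above:
print(f"{cluster_purity(27, 56):.3f}")   # 0.675
print(f"{cluster_entropy(27, 56):.3f}")  # 0.910
```

The match proportion 27/83 also explains the estimated match proportion of 0.325 carried into the next loop.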

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 545 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 133 matches and 412 non-matches
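
The split step trains a classifier on the oracle-labelled vectors and partitions the remaining vectors of the cluster by its predictions. As a stand-in for the SVM used here (the log does not show which SVM library or settings the program uses), a nearest-centroid split sketches the same partitioning logic:

```python
def split_cluster(labelled, remaining):
    """Partition `remaining` vectors into a predicted-match and a
    predicted-non-match child cluster. `labelled` is a list of
    (vector, is_match) pairs from the oracle. Nearest-centroid is a
    stand-in for the SVM classifier the log refers to."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    match_c = centroid([v for v, m in labelled if m])
    non_match_c = centroid([v for v, m in labelled if not m])

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches, non_matches = [], []
    for v in remaining:
        (matches if dist2(v, match_c) <= dist2(v, non_match_c)
         else non_matches).append(v)
    return matches, non_matches
```

Both child clusters inherit the parent's statistics, which is why the two queue entries in Loop 2 report identical purity, entropy, and estimated match proportion.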

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (412, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 133 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 133 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 50 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.235
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0
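
The oracle labels each sampled vector with a configurable accuracy (the `oracle_acc` parameter from the usage description); at 100.00% accuracy no labels are flipped, so false matches and false non-matches are always zero, as above. A sketch of such a noisy oracle (illustrative only, not code from the original program):

```python
import random

def query_oracle(true_labels, accuracy, rng=None):
    """Return labels that agree with `true_labels` with probability
    `accuracy`; each label is flipped independently otherwise."""
    rng = rng or random.Random()
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]
```

With `accuracy=1.0` every label is returned unchanged, since `random()` draws from the half-open interval [0.0, 1.0).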

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(15)135_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (15, 1 - acm diverg, 135), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)135_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 377
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 377 weight vectors
  Containing 206 true matches and 171 true non-matches
    (54.64% true matches)
  Identified 343 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   327  (95.34%)
          2 :    13  (3.79%)
          3 :     2  (0.58%)
         18 :     1  (0.29%)

Identified 1 non-pure unique weight vectors (from 343 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 168

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 376
  Number of unique weight vectors: 343
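
The cleanup above groups identical weight vectors, computes the fraction of matches in each group (its "pureness"), and drops the minority-class copies of any mixed group, e.g. the single non-match copy of the vector with pureness 0.944. A minimal sketch (names are illustrative; the original program's tie-breaking rule is not shown in this log):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """weight_vectors: list of (vector_tuple, is_match) pairs. For any
    vector value that occurs with both labels, keep only the
    majority-class copies (ties resolved toward matches here)."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)

    kept = []
    for vec, labels in groups.items():
        majority = sum(labels) * 2 >= len(labels)
        kept.extend((vec, lbl) for lbl in labels if lbl == majority)
    return kept
```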

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (343, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 343 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 343 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
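
Farthest-first selection, as listed above, greedily picks the vector whose minimum distance to the already-selected set is largest, spreading the sample across the weight-vector space. A minimal sketch (seeding from the first vector; the original program's seeding and distance metric are assumptions here):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly select the candidate
    with the largest distance to its nearest already-selected vector."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]            # seed choice is an assumption
    candidates = list(vectors[1:])
    while len(selected) < k and candidates:
        best = max(candidates,
                   key=lambda v: min(dist2(v, s) for s in selected))
        selected.append(best)
        candidates.remove(best)
    return selected
```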

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 30 matches and 45 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 268 weight vectors
  Based on 30 matches and 45 non-matches
  Classified 145 matches and 123 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.6, 0.9709505944546686, 0.4)
    (123, 0.6, 0.9709505944546686, 0.4)

Current size of match and non-match training data sets: 30 / 45

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 123 weight vectors
- Estimated match proportion 0.400

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 3 matches and 50 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing the file: diverg(15)95_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 95), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)95_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 788
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 788 weight vectors
  Containing 224 true matches and 564 true non-matches
    (28.43% true matches)
  Identified 749 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   730  (97.46%)
          2 :    16  (2.14%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vectors (from 749 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 561

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 787
  Number of unique weight vectors: 749

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (749, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 749 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 749 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 34 matches and 51 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 664 weight vectors
  Based on 34 matches and 51 non-matches
  Classified 153 matches and 511 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6, 0.9709505944546686, 0.4)
    (511, 0.6, 0.9709505944546686, 0.4)

Current size of match and non-match training data sets: 34 / 51

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 511 weight vectors
- Estimated match proportion 0.400

Sample size for this cluster: 78

Farthest first selection of 78 weight vectors from 511 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 4 matches and 74 non-matches
    Purity of oracle classification:  0.949
    Entropy of oracle classification: 0.292
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0
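
The purity and entropy figures the oracle reports above can be reproduced from the match/non-match counts alone, assuming the standard definitions (purity = fraction of the sample in the majority class, entropy = binary Shannon entropy of the match proportion). A minimal sketch:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Return (purity, entropy) of a binary split, as reported in the log."""
    total = num_matches + num_non_matches
    p = num_matches / total                  # match proportion
    purity = max(p, 1.0 - p)                 # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)      # binary Shannon entropy
    return purity, entropy

# The 4-match / 74-non-match oracle sample above gives 0.949 / 0.292:
purity, entropy = purity_entropy(4, 74)
print(round(purity, 3), round(entropy, 3))   # -> 0.949 0.292
```

The same function also reproduces the cluster statistics shown later in the queue printouts (e.g. 30 matches / 57 non-matches gives entropy 0.9293636...).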

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)562_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 562), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)562_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1073
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1073 weight vectors
  Containing 226 true matches and 847 true non-matches
    (21.06% true matches)
  Identified 1016 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   979  (96.36%)
          2 :    34  (3.35%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1016 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 826

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1072
  Number of unique weight vectors: 1016
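
The analysis step above groups identical weight vectors, builds the occurrence histogram, and drops the minority-class copies of any non-pure vector (one seen with both match and non-match labels). A sketch of that bookkeeping, under the assumption that vectors arrive as (weights_tuple, is_match) pairs:

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(vectors):
    """vectors: list of (weights_tuple, is_match) pairs.
    Returns (occurrence_histogram, pure_vectors), where pure_vectors drops
    the minority-class copies of any vector seen with both labels."""
    occurrences = Counter(w for w, _ in vectors)
    # Histogram: occurrence count -> number of distinct vectors with that count
    histogram = Counter(occurrences.values())

    labels = defaultdict(Counter)            # weights -> label counts
    for w, is_match in vectors:
        labels[w][is_match] += 1
    pure = []
    for w, counts in labels.items():
        majority = counts.most_common(1)[0][0]   # keep majority label only
        pure.extend((w, majority) for _ in range(counts[majority]))
    return histogram, pure

# One vector occurring 20 times (19 match, 1 non-match), two singletons:
vecs = [((0.5, 0.5), True)] * 19 + [((0.5, 0.5), False)] \
       + [((0.1, 0.2), False), ((0.9, 0.8), True)]
hist, pure = analyse_weight_vectors(vecs)
print(dict(hist))   # {20: 1, 1: 2}
print(len(pure))    # 21: the single minority-class copy was removed
```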

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1016, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1016 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1016 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
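
The "Farthest first selection" listed above is the classical farthest-first traversal: start from one vector, then repeatedly pick the vector whose minimum distance to the already-selected set is largest, so the sample spreads over the corners of the weight space. A sketch assuming squared Euclidean distance (the actual metric is not shown in this output):

```python
def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal; O(n*k) distance evaluations."""
    def dist2(a, b):                         # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[start]]
    # Minimum squared distance from each vector to the selected set so far
    min_d = [dist2(v, selected[0]) for v in vectors]
    for _ in range(k - 1):
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        min_d = [min(d, dist2(v, vectors[i])) for v, d in zip(vectors, min_d)]
    return selected

corners = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0), (0.9, 1.0)]
print(farthest_first(corners, 3))   # picks widely spread vectors
```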

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 929 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 158 matches and 771 non-matches
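
After the oracle labels the sample, the remaining vectors in the cluster are split by a classifier trained on those labels ("SVM classification of 929 weight vectors ... Based on 30 matches and 57 non-matches"), and the two predicted groups become new clusters in the queue. The log names an SVM; the self-contained sketch below uses a nearest-centroid rule as a stand-in for the split step (real code would substitute e.g. scikit-learn's SVC):

```python
def split_cluster(labelled, unlabelled):
    """labelled: list of (vector, is_match) pairs from the oracle.
    Splits `unlabelled` into (predicted_matches, predicted_non_matches)
    with a nearest-centroid rule (a stand-in for the SVM in the log)."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_match = centroid([v for v, m in labelled if m])
    c_non = centroid([v for v, m in labelled if not m])
    matches, non_matches = [], []
    for v in unlabelled:
        (matches if dist2(v, c_match) < dist2(v, c_non)
         else non_matches).append(v)
    return matches, non_matches   # each becomes a new cluster in the queue

labelled = [((0.9, 0.9), True), ((0.8, 1.0), True),
            ((0.1, 0.2), False), ((0.2, 0.0), False)]
m, n = split_cluster(labelled, [(0.95, 0.85), (0.15, 0.1)])
print(len(m), len(n))   # -> 1 1
```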

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (158, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (771, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 158 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 158 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 49 matches and 7 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)860_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 860), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)860_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 789
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 789 weight vectors
  Containing 225 true matches and 564 true non-matches
    (28.52% true matches)
  Identified 750 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   731  (97.47%)
          2 :    16  (2.13%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 750 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 561

Removed 1 non-pure weight vector

Final number of weight vectors to use: 788
  Number of unique weight vectors: 750

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (750, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 750 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 750 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 34 matches and 51 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 665 weight vectors
  Based on 34 matches and 51 non-matches
  Classified 153 matches and 512 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6, 0.9709505944546686, 0.4)
    (512, 0.6, 0.9709505944546686, 0.4)

Current size of match and non-match training data sets: 34 / 51

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 512 weight vectors
- Estimated match proportion 0.400

Sample size for this cluster: 78

Farthest first selection of 78 weight vectors from 512 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.870, 0.619, 0.643, 0.700, 0.524] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 4 matches and 74 non-matches
    Purity of oracle classification:  0.949
    Entropy of oracle classification: 0.292
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)718_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 718), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)718_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 895
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 895 weight vectors
  Containing 198 true matches and 697 true non-matches
    (22.12% true matches)
  Identified 850 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   816  (96.00%)
          2 :    31  (3.65%)
          3 :     2  (0.24%)
         11 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 850 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 676

Removed 1 non-pure weight vector

Final number of weight vectors to use: 894
  Number of unique weight vectors: 850

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (850, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 850 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 850 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
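
The farthest-first selection reported above can be sketched as a k-center greedy pick (a hedged illustration; `farthest_first` is an assumed name, and the original script's distance metric may differ from plain Euclidean distance):

```python
# Greedy farthest-first traversal: repeatedly pick the vector whose
# distance to its nearest already-selected vector is largest.
import math

def farthest_first(vectors, k):
    selected = [vectors[0]]                       # seed with the first vector
    # distance from each vector to its closest selected vector so far
    dists = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        far_idx = max(range(len(vectors)), key=dists.__getitem__)
        selected.append(vectors[far_idx])
        # shrink each distance to account for the newly selected vector
        dists = [min(d, math.dist(v, vectors[far_idx]))
                 for d, v in zip(dists, vectors)]
    return selected
```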

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 31 matches and 55 non-matches
    Purity of oracle classification:  0.640
    Entropy of oracle classification: 0.943
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
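
The purity and entropy figures reported for each oracle classification are consistent with the majority-class fraction and the binary Shannon entropy of the match proportion (inferred from the numbers in this log; `purity_entropy` is an illustrative name, not from the original script):

```python
# Purity = fraction of the majority class; entropy = binary Shannon
# entropy of the match proportion, in bits.
import math

def purity_entropy(num_matches, num_non_matches):
    total = num_matches + num_non_matches
    p = num_matches / total               # match proportion
    purity = max(p, 1.0 - p)              # majority-class fraction
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# Reproduces the Loop 1 figures above: 31 matches, 55 non-matches
purity, entropy = purity_entropy(31, 55)
```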

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 764 weight vectors
  Based on 31 matches and 55 non-matches
  Classified 194 matches and 570 non-matches
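
The SVM split step (train on the oracle-labelled sample, then partition the remaining cluster by predicted class) might look like the following sketch using scikit-learn's `SVC`; the original script's classifier settings are not shown in this log and may differ:

```python
# Train an SVM on the oracle-labelled sample and split the rest of the
# cluster into predicted matches (label 1) and non-matches (label 0).
from sklearn import svm

def svm_split(train_vecs, train_labels, rest_vecs):
    clf = svm.SVC()                      # default RBF-kernel SVM (assumed)
    clf.fit(train_vecs, train_labels)    # e.g. 31 matches + 55 non-matches
    preds = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, preds) if p == 0]
    return matches, non_matches
```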

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (194, 0.6395348837209303, 0.9430685934712908, 0.36046511627906974)
    (570, 0.6395348837209303, 0.9430685934712908, 0.36046511627906974)

Current size of match and non-match training data sets: 31 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 194 weight vectors
- Estimated match proportion 0.360

Sample size for this cluster: 61

Farthest first selection of 61 weight vectors from 194 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.619, 1.000, 0.103, 0.163, 0.129, 0.146, 0.213] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 61 weight vectors
  The oracle will correctly classify 61 weight vectors and wrongly classify 0
  Classified 41 matches and 20 non-matches
    Purity of oracle classification:  0.672
    Entropy of oracle classification: 0.913
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  20
    Number of false non-matches: 0

Deleted 61 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(15)221_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 221), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)221_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 201 true matches and 752 true non-matches
    (21.09% true matches)
  Identified 908 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   874  (96.26%)
          2 :    31  (3.41%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 908 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 952
  Number of unique weight vectors: 908

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (908, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 908 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 908 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 821 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 115 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (115, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(20)378_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 378), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)378_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
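
The purity and entropy figures in these oracle summaries are consistent with the majority-class fraction and the base-2 Shannon entropy of the match proportion; a minimal sketch (the function name is hypothetical, not from the script):

```python
import math

def purity_entropy(num_match, num_non_match):
    # Proportion of matches among the oracle-classified weight vectors.
    total = num_match + num_non_match
    p = num_match / total
    # Purity: fraction of vectors belonging to the majority class.
    purity = max(p, 1.0 - p)
    # Shannon entropy (base 2) of the match/non-match split.
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

With 23 matches and 65 non-matches, as in the summary above, this gives purity 0.739 and entropy 0.829, matching the printed values.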

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches
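
The SVM step trains on the oracle-labelled sample and assigns every remaining weight vector to a predicted-match or predicted-non-match cluster; a sketch of that split using scikit-learn (the helper name and the linear kernel are assumptions, not taken from the script):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    # Train a binary SVM on the oracle-classified weight vectors
    # (labels: 1 = match, 0 = non-match).
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    # Classify the unlabelled weight vectors and split the cluster in two.
    pred = clf.predict(rest_vecs)
    return rest_vecs[pred == 1], rest_vecs[pred == 0]
```

Here the 23 + 65 labelled vectors would play the role of the training set, and the 948 remaining vectors the role of `rest_vecs`, producing the 101/847 split reported above.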

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 101 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(20)79_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 79), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)79_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614
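
The occurrence and pureness statistics above can be reproduced by grouping identical weight vectors and counting how many of each group's occurrences are true matches; a stand-alone sketch (the function name is hypothetical):

```python
from collections import Counter, defaultdict

def analyse_vectors(labelled_vecs):
    # labelled_vecs: list of (weight_tuple, is_match) pairs.
    occurrences = Counter(w for w, _ in labelled_vecs)
    matches = defaultdict(int)
    for w, is_match in labelled_vecs:
        matches[w] += int(is_match)
    # Frequency distribution: occurrence count -> number of unique vectors.
    freq_dist = Counter(occurrences.values())
    # Pureness: proportion of true matches per unique weight vector.
    pureness = {w: matches[w] / occurrences[w] for w in occurrences}
    return freq_dist, pureness
```

A unique vector whose pureness lies strictly between 0 and 1 is non-pure; removing its minority-class occurrences is what takes the 862 loaded vectors down to 861 in this run.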

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
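
The "farthest first" selections above follow the classic greedy traversal: start from a seed vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. A minimal version, assuming Euclidean distance and a fixed seed (both assumptions; the script may use a different metric or seed):

```python
import math

def farthest_first(vectors, k, seed_index=0):
    # Greedy farthest-first traversal over a list of weight vectors (tuples).
    selected = [vectors[seed_index]]
    remaining = [v for i, v in enumerate(vectors) if i != seed_index]
    while remaining and len(selected) < k:
        # Pick the vector maximising the distance to its nearest selected one.
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

This tends to pick spread-out, extreme vectors first, which is why the sampled lists above mix clear matches and clear non-matches rather than near-duplicates.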

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 566 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)966_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 966), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)966_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
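The purity and entropy figures reported above can be reproduced from the match/non-match counts alone; a minimal sketch, assuming purity is the majority-class fraction and entropy is the binary Shannon entropy of the class split (the function name is illustrative, not from the original script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a
    match/non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The oracle block above classified 14 matches and 54 non-matches:
purity, entropy = purity_entropy(14, 54)
print(round(purity, 3), round(entropy, 3))  # 0.794 0.734
```

The same formula reproduces the later oracle blocks as well (e.g. 28 matches / 58 non-matches gives purity 0.674 and entropy 0.910).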

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(20)435_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 435), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)435_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805
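The uniqueness and pureness analysis above amounts to grouping identical weight vectors and counting the match fraction within each group; a minimal sketch with hypothetical toy data (the function name and data are illustrative, not from the original script):

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(vectors):
    """vectors: list of (tuple_of_weights, is_match) pairs.
    Returns (occurrence_distribution, pureness_per_unique_vector)."""
    counts = Counter(w for w, _ in vectors)
    matches = defaultdict(int)
    for w, is_match in vectors:
        if is_match:
            matches[w] += 1
    # Occurrence : number of unique vectors that occur that often
    occ_dist = Counter(counts.values())
    # Pureness: fraction of matches among the copies of each unique vector
    pureness = {w: matches[w] / n for w, n in counts.items()}
    return occ_dist, pureness

vecs = [((0.5, 1.0), True), ((0.5, 1.0), True),
        ((0.5, 1.0), False), ((0.2, 0.0), False)]
occ, pure = analyse_weight_vectors(vecs)
print(round(pure[(0.5, 1.0)], 3))  # 0.667 -> a non-pure unique vector
```

A unique vector with pureness strictly between 0 and 1 (like 0.667 here) is what the log flags as "non-pure": its minority-class copies are removed before training-example selection.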

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
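Farthest-first selection, as logged above, greedily picks at each step the weight vector whose minimum distance to the already-selected set is largest; a minimal sketch, assuming Euclidean distance and an arbitrary seed vector (the actual program may differ in both choices):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum distance to the already-selected set is largest."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]  # arbitrary seed (an assumption)
    # Minimum squared distance from each candidate to the selected set
    min_d = [dist2(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_d[i] = min(min_d[i], dist2(v, vectors[idx]))
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

Caching the per-candidate minimum distance makes each round a single pass over the data, which matters when sampling 86 of 805 vectors as above.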

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches
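The split step above trains a classifier on the oracle-labelled sample and partitions the remaining 719 vectors by its predictions. The original uses an SVM; as a self-contained illustration of the partitioning step only, here is a nearest-centroid stand-in (a deliberate simplification, not the program's classifier):

```python
def split_cluster(train_matches, train_non_matches, rest):
    """Partition 'rest' by which training-class centroid each vector
    is closer to (a nearest-centroid stand-in for the SVM split)."""
    def centroid(vs):
        return [sum(col) / len(vs) for col in zip(*vs)]

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cm = centroid(train_matches)       # centroid of labelled matches
    cn = centroid(train_non_matches)   # centroid of labelled non-matches
    matches = [v for v in rest if dist2(v, cm) < dist2(v, cn)]
    non_matches = [v for v in rest if dist2(v, cm) >= dist2(v, cn)]
    return matches, non_matches

# Toy example: high-similarity vectors labelled match, low ones non-match
m, n = split_cluster([[0.9, 0.8], [0.95, 0.9]],
                     [[0.1, 0.2], [0.2, 0.1]],
                     [[0.85, 0.9], [0.15, 0.1]])
print(len(m), len(n))  # 1 1
```

Each resulting partition becomes a new cluster on the queue, carrying the parent's purity, entropy, and estimated match proportion, as the Loop 2 listing below shows.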

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 153 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)659_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 659), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)659_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 749
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 749 weight vectors
  Containing 194 true matches and 555 true non-matches
    (25.90% true matches)
  Identified 707 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   672  (95.05%)
          2 :    32  (4.53%)
          3 :     2  (0.28%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 707 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.000 : 535

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 749
  Number of unique weight vectors: 707

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (707, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 707 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 707 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 623 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 286 matches and 337 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (286, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (337, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 337 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 337 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.538, 0.500, 0.818, 0.789, 0.750] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.750, 0.778, 0.471, 0.727, 0.684] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.833, 0.571, 0.727, 0.647, 0.857] (False)
    [1.000, 0.000, 0.857, 0.286, 0.500, 0.643, 0.600] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [0.800, 0.000, 0.625, 0.571, 0.467, 0.474, 0.667] (False)
    [1.000, 0.000, 0.423, 0.478, 0.500, 0.813, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.583, 0.389, 0.471, 0.545, 0.474] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.385, 0.391, 0.667, 0.579, 0.824] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.000, 0.700, 0.818, 0.444, 0.619] (False)
    [1.000, 0.000, 0.857, 0.444, 0.556, 0.235, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [1.000, 0.000, 0.333, 0.750, 0.667, 0.667, 0.571] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.067, 0.550, 0.818, 0.727, 0.762] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analyzing file: diverg(20)213_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 213), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)213_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
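The farthest-first selection above can be sketched as a greedy farthest-first traversal: seed with one vector (here the first, which is an assumption — the program's seeding rule is not shown), then repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors
    (tuples of floats), returning k spread-out vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                 # assumption: seed with the first
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

This greedy rule deliberately favours extreme, mutually distant weight vectors, which is why the selected sample mixes near-all-1.0 and near-all-0.0 vectors.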

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
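The oracle step can be simulated with a hypothetical helper mirroring the `oracle_acc` parameter: each queried label is returned correctly with probability `accuracy`, otherwise flipped. With accuracy 1.0, as in this run, every answer is correct:

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    """Simulate a human oracle that answers each query correctly with
    probability `accuracy` and flips the label otherwise (hypothetical
    helper; the program's own oracle code is not shown)."""
    rng = rng or random.Random()
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]

# With accuracy 1.0 every answer matches the true label.
answers = noisy_oracle([True, False, True], 1.0)
```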

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches
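The split step — training on the oracle-labelled vectors and classifying the remaining cluster members into two sub-clusters — can be sketched with scikit-learn's `SVC`. This is an assumption about the implementation (the original program's SVM code is not shown), and the data below is hypothetical 2-D toy data:

```python
from sklearn.svm import SVC

# Hypothetical oracle-labelled sample (weight vectors shortened to 2-D
# for illustration; 1 = match, 0 = non-match).
train_X = [[0.90, 0.95], [0.85, 0.80], [0.10, 0.20], [0.20, 0.15]]
train_y = [1, 1, 0, 0]

clf = SVC(kernel="linear")   # linear SVM trained on the oracle's labels
clf.fit(train_X, train_y)

# The remaining (unlabelled) cluster members are split by the prediction,
# giving the two sub-clusters that go back onto the queue.
rest = [[0.88, 0.90], [0.15, 0.10]]
pred = clf.predict(rest)
matches     = [v for v, p in zip(rest, pred) if p == 1]
non_matches = [v for v, p in zip(rest, pred) if p == 0]
```

Each predicted sub-cluster inherits the parent's purity and entropy estimates until it is itself sampled, which is why both queue entries in Loop 2 show identical statistics.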

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 101 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-matches
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(15)169_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 169), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)169_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 983
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 983 weight vectors
  Containing 198 true matches and 785 true non-matches
    (20.14% true matches)
  Identified 941 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   906  (96.28%)
          2 :    32  (3.40%)
          3 :     2  (0.21%)
          7 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 941 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 765

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 983
  Number of unique weight vectors: 941

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (941, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 941 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using the "far" method

Farthest first selection of 87 weight vectors from 941 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 29 matches and 58 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 854 weight vectors
  Based on 29 matches and 58 non-matches
  Classified 144 matches and 710 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (710, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 29 / 58

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.92
- Size 710 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 710 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 3 matches and 73 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.240
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(20)906_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 906), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)906_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using the "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 950 non-matches

46.0
Analysing the file: diverg(20)604_NEW.csv
<class 'pandas.core.series.Series'>
Linha atual aqui, jovem!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 604), dtype: object
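
The precision, recall and f-measure fields in the summary above follow the standard definitions from the tp/fp/fn counts: recall = 39 / (39 + 260) ≈ 0.130435, and f-measure is the harmonic mean of precision and recall. A minimal sketch (the helper name prf is my own):

```python
def prf(tp, fp, fn):
    # Standard precision / recall / F1 from true-positive, false-positive
    # and false-negative counts.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f
```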

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)604_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 226 true matches and 857 true non-matches
    (20.87% true matches)
  Identified 1026 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   989  (96.39%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1026 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1026
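
The analysis above groups identical weight vectors, computes each unique vector's pureness (its fraction of true matches), and removes the minority-class copies of any non-pure vector. A minimal sketch under those assumptions (the helper name analyse_weight_vectors is my own):

```python
from collections import defaultdict

def analyse_weight_vectors(pairs):
    # pairs: iterable of (weight_tuple, is_match) for each record pair.
    counts = defaultdict(lambda: [0, 0])  # weight_tuple -> [matches, non_matches]
    for w, is_match in pairs:
        counts[w][0 if is_match else 1] += 1
    removed = []
    for w, (nm, nn) in counts.items():
        pureness = nm / (nm + nn)
        if 0.0 < pureness < 1.0:  # non-pure: both classes occur
            removed.append((w, min(nm, nn)))  # minority-class copies to drop
    return counts, removed
```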

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1026, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1026 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using the "far" method

Farthest first selection of 88 weight vectors from 1026 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
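
Farthest-first selection is the greedy farthest-point traversal: seed with one vector, then repeatedly pick the vector whose distance to the nearest already-selected vector is largest, spreading the sample across the cluster. A minimal sketch assuming Euclidean distance and a random seed vector (both assumptions; the original implementation is not shown in this output):

```python
import numpy as np

def farthest_first(vectors, k, rng=None):
    # Greedy farthest-first traversal over a list of weight vectors.
    rng = np.random.default_rng(rng)
    vectors = np.asarray(vectors, dtype=float)
    selected = [int(rng.integers(len(vectors)))]  # random seed vector
    # min_dist[i] = distance from vector i to its nearest selected vector
    min_dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(min_dist))  # farthest from the selected set
        selected.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```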

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 29 matches and 59 non-matches
    Purity of oracle classification:  0.670
    Entropy of oracle classification: 0.914
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 938 weight vectors
  Based on 29 matches and 59 non-matches
  Classified 159 matches and 779 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)
    (779, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)

Current size of match and non-match training data sets: 29 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 779 weight vectors
- Estimated match proportion 0.330

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 779 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.667, 0.500, 0.647, 0.556, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.750, 0.429, 0.526, 0.500, 0.846] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.462, 0.889, 0.455, 0.211, 0.375] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.412, 0.318, 0.421] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.233, 0.545, 0.714, 0.455, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 2 matches and 74 non-matches
    Purity of oracle classification:  0.974
    Entropy of oracle classification: 0.176
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)205_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 205), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)205_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using the "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 69 matches and 842 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (69, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (842, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 69 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 37

Farthest first selection of 37 weight vectors from 69 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.933, 1.000, 0.952, 1.000, 1.000, 0.944, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.833, 1.000, 1.000, 0.935] (True)
    [1.000, 1.000, 1.000, 1.000, 0.950, 0.923, 0.941] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.958, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)

Perform oracle with 100.00% accuracy on 37 weight vectors
  The oracle will correctly classify 37 weight vectors and wrongly classify 0
  Classified 37 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 37 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)302_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 302), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)302_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1046
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1046 weight vectors
  Containing 225 true matches and 821 true non-matches
    (21.51% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   952  (96.26%)
          2 :    34  (3.44%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 800

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1045
  Number of unique weight vectors: 989

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using the "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 32 matches and 55 non-matches
    Purity of oracle classification:  0.632
    Entropy of oracle classification: 0.949
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
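
The purity, entropy, and estimated match proportion figures printed above follow directly from the oracle's match/non-match counts. A minimal sketch of how they can be computed (reconstructed from the printed numbers, not the authors' code):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity, entropy, and estimated match proportion of a labelled sample.

    Purity is the majority-class fraction; entropy is the Shannon entropy
    (base 2) of the match/non-match split.
    """
    total = num_matches + num_non_matches
    p = num_matches / total                           # estimated match proportion
    purity = max(num_matches, num_non_matches) / total
    if p in (0.0, 1.0):
        entropy = 0.0                                 # a pure cluster has zero entropy
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy, p

# 32 matches and 55 non-matches, as classified by the oracle above:
purity, entropy, p = cluster_stats(32, 55)
```

This reproduces the printed figures: purity 0.632, entropy 0.949, estimated match proportion 0.368.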

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 32 matches and 55 non-matches
  Classified 330 matches and 572 non-matches
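
The SVM step trains on the oracle-labelled sample and splits the remaining cluster by predicted class. A hedged sketch using scikit-learn's `SVC` with stand-in random data (`train_X`, `cluster_X`, and the linear kernel are assumptions; the original program may configure the classifier differently):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in data (assumption): 87 oracle-labelled 7-dimensional weight
# vectors and 902 unlabelled vectors remaining in the cluster.
train_X = rng.random((87, 7))
train_y = np.array([1] * 32 + [0] * 55)   # 32 matches, 55 non-matches
cluster_X = rng.random((902, 7))

# Train on the labelled sample, then split the cluster by predicted class;
# each half goes back on the queue as a new cluster.
clf = SVC(kernel="linear").fit(train_X, train_y)
pred = clf.predict(cluster_X)
match_cluster = cluster_X[pred == 1]
non_match_cluster = cluster_X[pred == 0]
```

Note that both resulting clusters inherit the purity/entropy estimates of the labelled sample, which is why the two queue entries in Loop 2 share the same statistics.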

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (330, 0.632183908045977, 0.9489804585630242, 0.367816091954023)
    (572, 0.632183908045977, 0.9489804585630242, 0.367816091954023)

Current size of match and non-match training data sets: 32 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 330 weight vectors
- Estimated match proportion 0.368

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 330 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.909, 1.000, 1.000, 1.000, 0.947] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
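
Farthest-first selection picks a random seed vector and then repeatedly adds the vector whose distance to its nearest already-selected vector is largest. A generic sketch of this traversal (the Euclidean metric and random first pick are assumptions; this is not the authors' implementation):

```python
import math
import random

def farthest_first(vectors, k, seed=0):
    """Select k vectors by farthest-first traversal: start from a random
    vector, then repeatedly add the vector whose distance to its nearest
    already-selected vector is largest."""
    rnd = random.Random(seed)
    selected = [rnd.choice(vectors)]
    while len(selected) < k:
        candidates = [v for v in vectors if v not in selected]
        best = max(candidates,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

Selecting by maximum nearest-neighbour distance spreads the sample across the weight-vector space, which is why the listings above mix clear matches, clear non-matches, and borderline vectors.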

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 41 matches and 29 non-matches
    Purity of oracle classification:  0.586
    Entropy of oracle classification: 0.979
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  29
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)747_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 747), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)747_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 733
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 733 weight vectors
  Containing 210 true matches and 523 true non-matches
    (28.65% true matches)
  Identified 699 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   682  (97.57%)
          2 :    14  (2.00%)
          3 :     2  (0.29%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 699 unique weight vectors)
Pureness (percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 178
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 520

Removed 1 non-pure weight vector

Final number of weight vectors to use: 732
  Number of unique weight vectors: 699
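
The pureness analysis above groups identical weight vectors and removes the minority-class copies of any group that contains both labels (here, one non-match among the 17 copies of a single vector, giving pureness 16/17 ≈ 0.941). A sketch of this step, assuming ties resolve to the match class:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """Drop minority-class copies of any weight vector that occurs with both
    labels, so every unique vector becomes pure.

    `weight_vectors` is a list of (vector, is_match) pairs; resolving ties
    to the match class is an assumption, not the authors' documented rule.
    """
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        majority = sum(labels) * 2 >= len(labels)   # majority label of the group
        kept.extend((list(vec), lab) for lab in labels if lab == majority)
    return kept
```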

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (699, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 699 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 699 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 615 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 141 matches and 474 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (474, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 141 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 50 matches and 4 non-matches
    Purity of oracle classification:  0.926
    Entropy of oracle classification: 0.381
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(10)55_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (10, 1 - acm diverg, 55), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)55_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 700
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 700 weight vectors
  Containing 214 true matches and 486 true non-matches
    (30.57% true matches)
  Identified 665 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   650  (97.74%)
          2 :    12  (1.80%)
          3 :     2  (0.30%)
         20 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 665 unique weight vectors)
Pureness (percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 179
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 485

Removed 1 non-pure weight vector

Final number of weight vectors to use: 699
  Number of unique weight vectors: 665

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (665, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 665 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 665 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 581 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 314 matches and 267 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (314, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (267, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 314 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 314 vectors
  The selected farthest weight vectors are:
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.890, 1.000, 0.281, 0.136, 0.183, 0.250, 0.163] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
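
The "farthest first" selection above picks a diverse sample: starting from one vector, it repeatedly adds the vector whose distance to the nearest already-selected vector is largest. A minimal sketch (the function and parameter names are assumptions; the log does not show the actual implementation):

```python
import random

def farthest_first(vectors, k, seed=None):
    """Select k vectors by farthest-first traversal: start from one
    vector, then repeatedly add the vector whose distance to the
    nearest already-selected vector is largest."""
    rng = random.Random(seed)

    def dist(a, b):
        # Euclidean distance between two weight vectors
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [rng.choice(vectors)]
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        min_dist = [min(d, dist(v, vectors[i]))
                    for d, v in zip(min_dist, vectors)]
    return selected
```

Because a selected vector has distance zero to itself, the traversal never re-selects a vector while unselected (distinct) vectors remain.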

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 42 matches and 28 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0
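
The purity and entropy figures reported for each oracle-classified sample (0.600 and 0.971 for the 42 matches and 28 non-matches above) follow directly from the binary class proportions; a small sketch, assuming purity is the majority-class fraction and entropy is the binary Shannon entropy:

```python
from math import log2

def purity_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class in the sample.
    Entropy: binary Shannon entropy of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```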

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(15)650_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 650), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)650_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 708
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 708 weight vectors
  Containing 196 true matches and 512 true non-matches
    (27.68% true matches)
  Identified 684 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   667  (97.51%)
          2 :    14  (2.05%)
          3 :     2  (0.29%)
          7 :     1  (0.15%)
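
A frequency distribution like the one above can be built by counting duplicate weight vectors and then counting the counts; a sketch (the function name is an assumption):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then tabulate
    'occurrence : number of unique vectors that occur that often'."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return dict(sorted(Counter(per_vector.values()).items()))
```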

Identified 0 non-pure unique weight vectors (from 684 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 510

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 708
  Number of unique weight vectors: 684

Time to load and analyse the weight vector file: 0.04 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (684, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 684 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84
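
The sample sizes in this log (84 of 684 and 85 of 722 at an estimated match proportion of 0.5) are consistent with Cochran's sample-size formula with a finite-population correction, sketched below; the margin of error, confidence level (z), and rounding are assumptions, as the log does not show the exact formula:

```python
def cochran_sample_size(cluster_size, match_prop, margin=0.1, z=1.96):
    """Cochran's sample-size formula with finite-population correction,
    using the cluster's estimated match proportion. The margin of error
    and z value are assumptions, not taken from the program."""
    # Unadjusted sample size for an infinite population
    n0 = z * z * match_prop * (1.0 - match_prop) / (margin * margin)
    # Correct for the finite cluster size and round to an integer
    return round(n0 / (1.0 + (n0 - 1.0) / cluster_size))
```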

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 684 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 33 matches and 51 non-matches
    Purity of oracle classification:  0.607
    Entropy of oracle classification: 0.967
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 600 weight vectors
  Based on 33 matches and 51 non-matches
  Classified 139 matches and 461 non-matches
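
The splitting step trains a classifier on the oracle-labelled sample and partitions the remaining unlabelled vectors by its predictions. A sketch using scikit-learn's `SVC` (an assumption: the original program may use a different SVM implementation):

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, cluster_vectors):
    """Train an SVM on the oracle-labelled vectors, then split the
    remaining cluster vectors into predicted matches and non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, preds) if p]
    non_matches = [v for v, p in zip(cluster_vectors, preds) if not p]
    return matches, non_matches
```

The two resulting subsets become the child clusters pushed onto the queue.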

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (139, 0.6071428571428571, 0.9666186325481028, 0.39285714285714285)
    (461, 0.6071428571428571, 0.9666186325481028, 0.39285714285714285)

Current size of match and non-match training data sets: 33 / 51

Selected cluster with (queue ordering: random):
- Purity 0.61 and entropy 0.97
- Size 461 weight vectors
- Estimated match proportion 0.393

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 461 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 3 matches and 73 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.240
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(10)790_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981481
recall                 0.177258
f-measure              0.300283
da                           54
dm                            0
ndm                           0
tp                           53
fp                            1
tn                  4.76529e+07
fn                          246
Name: (10, 1 - acm diverg, 790), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)790_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 758
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 758 weight vectors
  Containing 208 true matches and 550 true non-matches
    (27.44% true matches)
  Identified 722 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   705  (97.65%)
          2 :    14  (1.94%)
          3 :     2  (0.28%)
         19 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 722 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 547

Removed 1 non-pure weight vector
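
The removal above (the unique vector with pureness 0.947 loses its single minority-class copy, taking 758 vectors to 757) can be sketched as grouping identical vectors and dropping minority-class occurrences; the names here are assumptions:

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    """Group identical weight vectors, compute each group's pureness
    (fraction of true matches among its occurrences), and drop the
    minority-class copies of any non-pure group."""
    groups = defaultdict(list)
    for vec, label in zip(weight_vectors, labels):
        groups[tuple(vec)].append(label)
    kept_vectors, kept_labels = [], []
    for vec, label in zip(weight_vectors, labels):
        group = groups[tuple(vec)]
        # Keep only copies that agree with the group's majority class
        majority_is_match = sum(group) / len(group) >= 0.5
        if label == majority_is_match:
            kept_vectors.append(vec)
            kept_labels.append(label)
    return kept_vectors, kept_labels
```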

Final number of weight vectors to use: 757
  Number of unique weight vectors: 722

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (722, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 722 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 722 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 33 matches and 52 non-matches
    Purity of oracle classification:  0.612
    Entropy of oracle classification: 0.964
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 637 weight vectors
  Based on 33 matches and 52 non-matches
  Classified 307 matches and 330 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (307, 0.611764705882353, 0.9636512739945753, 0.38823529411764707)
    (330, 0.611764705882353, 0.9636512739945753, 0.38823529411764707)

Current size of match and non-match training data sets: 33 / 52

Selected cluster with (queue ordering: random):
- Purity 0.61 and entropy 0.96
- Size 307 weight vectors
- Estimated match proportion 0.388

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 307 vectors
  The selected farthest weight vectors are:
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.890, 1.000, 0.281, 0.136, 0.183, 0.250, 0.163] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 43 matches and 27 non-matches
    Purity of oracle classification:  0.614
    Entropy of oracle classification: 0.962
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  27
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

54.0
Analysing the file: diverg(10)681_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 681), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)681_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 767
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 767 weight vectors
  Containing 196 true matches and 571 true non-matches
    (25.55% true matches)
  Identified 725 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   690  (95.17%)
          2 :    32  (4.41%)
          3 :     2  (0.28%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 725 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 551
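
The pureness analysis above groups identical weight vectors and, for each unique vector, computes the fraction of its occurrences that are true matches (1.000 = always a match, 0.000 = never). A dependency-free sketch of that grouping, with illustrative names and toy data:

```python
from collections import defaultdict

def pureness_counts(weight_vectors):
    """weight_vectors: list of (tuple_of_weights, is_match) pairs.
    Returns a dict mapping each unique vector to its match fraction."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)
    return {vec: sum(labels) / len(labels)   # fraction of true matches
            for vec, labels in groups.items()}

# Toy example: one vector occurs three times with mixed labels (non-pure)
vectors = [((1.0, 0.9), True), ((1.0, 0.9), True), ((1.0, 0.9), False),
           ((0.1, 0.2), False)]
result = pureness_counts(vectors)
print(result)
```

A vector with pureness strictly between 0 and 1 is "non-pure"; as the log notes, its minority-class occurrences are removed before training selection starts.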

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 767
  Number of unique weight vectors: 725

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (725, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 725 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 725 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
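
The "farthest first" selection above is the classic farthest-first traversal: start from some vector, then repeatedly pick the vector whose minimum Euclidean distance to the already-selected set is largest, so the sample spreads across the cluster. A minimal sketch under that assumption (names and toy points are ours):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedily select k vectors, each maximising its minimum
    Euclidean distance to the vectors selected so far."""
    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        def min_dist(v):
            return min(math.dist(v, s) for s in selected)
        best = max(remaining, key=min_dist)  # farthest from selection
        selected.append(best)
        remaining.remove(best)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
chosen = farthest_first(pts, 3)
print(chosen)  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

The near-duplicate point (0.1, 0.0) is skipped, which is why the selected vectors in the log mix very high and very low similarity values.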

Perform oracle with 100.00 accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 25 matches and 60 non-matches
    Purity of oracle classification:  0.706
    Entropy of oracle classification: 0.874
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 640 weight vectors
  Based on 25 matches and 60 non-matches
  Classified 98 matches and 542 non-matches
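
The split step above trains a classifier on the oracle-labelled vectors and divides the remaining cluster into a predicted-match and a predicted-non-match sub-cluster (the two queue entries in the next loop). The program uses an SVM; as a dependency-free stand-in this sketch uses a nearest-centroid rule, but the control flow is the same:

```python
def split_cluster(train_vecs, train_labels, cluster_vecs):
    """Split cluster_vecs into (predicted matches, predicted non-matches),
    using a nearest-centroid classifier as a stand-in for the SVM."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]
    match_c = centroid([v for v, l in zip(train_vecs, train_labels) if l])
    non_c = centroid([v for v, l in zip(train_vecs, train_labels) if not l])
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    matches, non_matches = [], []
    for v in cluster_vecs:
        (matches if sq_dist(v, match_c) < sq_dist(v, non_c)
         else non_matches).append(v)
    return matches, non_matches

# Toy labelled sample and two unlabelled cluster vectors
train = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]]
labels = [True, True, False, False]
m, nm = split_cluster(train, labels, [[0.95, 0.85], [0.15, 0.15]])
print(len(m), len(nm))  # 1 1
```

Both sub-clusters initially inherit the parent's purity, entropy, and estimated match proportion, which is why the two queue entries in Loop 2 carry identical statistics.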

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (98, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)
    (542, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)

Current size of match and non-match training data sets: 25 / 60

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 98 weight vectors
- Estimated match proportion 0.294

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 98 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.420, 1.000, 1.000, 1.000, 1.000, 1.000, 0.947] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)

Perform oracle with 100.00 accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 42 matches and 2 non-matches
    Purity of oracle classification:  0.955
    Entropy of oracle classification: 0.267
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(20)517_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 517), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)517_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1092
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1092 weight vectors
  Containing 226 true matches and 866 true non-matches
    (20.70% true matches)
  Identified 1035 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   998  (96.43%)
          2 :    34  (3.29%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1035 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 845

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1091
  Number of unique weight vectors: 1035

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1035, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1035 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1035 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00 accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 947 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 816 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (816, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 816 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 816 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00 accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 11 matches and 60 non-matches
    Purity of oracle classification:  0.845
    Entropy of oracle classification: 0.622
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)772_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 772), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)772_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 528
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 528 weight vectors
  Containing 224 true matches and 304 true non-matches
    (42.42% true matches)
  Identified 489 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   470  (96.11%)
          2 :    16  (3.27%)
          3 :     2  (0.41%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 489 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 301

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 527
  Number of unique weight vectors: 489

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (489, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 489 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 489 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 36 matches and 44 non-matches
    Purity of oracle classification:  0.550
    Entropy of oracle classification: 0.993
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0
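The purity, entropy, and estimated match proportion reported for each oracle round follow directly from the match and non-match counts; a minimal sketch (the function names are ours, not from the original script):

```python
import math

def purity(n_match, n_nonmatch):
    # Fraction of the classified sample that belongs to the majority class
    total = n_match + n_nonmatch
    return max(n_match, n_nonmatch) / total

def entropy(n_match, n_nonmatch):
    # Shannon entropy (base 2) of the match / non-match distribution
    total = n_match + n_nonmatch
    h = 0.0
    for n in (n_match, n_nonmatch):
        if n > 0:
            p = n / total
            h -= p * math.log2(p)
    return h

# The first oracle round above classified 36 matches and 44 non-matches:
print(purity(36, 44))   # 0.55
print(entropy(36, 44))  # ~0.9928
print(36 / (36 + 44))   # 0.45, the estimated match proportion
```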

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 409 weight vectors
  Based on 36 matches and 44 non-matches
  Classified 208 matches and 201 non-matches
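Each SVM step like the one above trains on the weight vectors the oracle has just labelled and then splits the rest of the cluster by predicted class into two new queue entries. A minimal sketch, assuming scikit-learn is available (function and variable names are ours):

```python
import numpy as np
from sklearn.svm import SVC  # assumption: scikit-learn is installed

def svm_split(train_vecs, train_labels, cluster_vecs):
    # Train a binary SVM on the oracle-classified sample ...
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    # ... and split the unlabelled remainder by its predictions
    pred = clf.predict(cluster_vecs)
    return cluster_vecs[pred == 1], cluster_vecs[pred == 0]
```

Both predicted sub-clusters go back onto the queue with the purity and entropy estimated from the oracle sample, which is why the two Loop 2 entries above share the same purity/entropy values.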

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (208, 0.55, 0.9927744539878084, 0.45)
    (201, 0.55, 0.9927744539878084, 0.45)

Current size of match and non-match training data sets: 36 / 44

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 208 weight vectors
- Estimated match proportion 0.450

Sample size for this cluster: 65

Farthest first selection of 65 weight vectors from 208 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.261, 0.174, 0.148, 0.186, 0.148] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 1.000, 0.214, 0.184, 0.250, 0.267, 0.111] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
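A farthest-first selection like the one listed above greedily picks the vector whose minimum distance to everything selected so far is largest, which is why the sample spreads across very dissimilar weight vectors. A minimal sketch assuming NumPy and Euclidean distance (the original script's distance measure and seeding may differ):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    # Greedy farthest-first traversal: seed with a random vector, then
    # repeatedly add the vector farthest from the selected set.
    vectors = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(vectors)))]
    # Minimum distance from every vector to the selected set so far
    dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(dist.argmax())
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```

Each round then hands the selected sample to the oracle for manual classification.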

Perform oracle with 100.00% accuracy on 65 weight vectors
  The oracle will correctly classify 65 weight vectors and wrongly classify 0
  Classified 45 matches and 20 non-matches
    Purity of oracle classification:  0.692
    Entropy of oracle classification: 0.890
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  20
    Number of false non-matches: 0

Deleted 65 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)897_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 897), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)897_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1081
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1081 weight vectors
  Containing 209 true matches and 872 true non-matches
    (19.33% true matches)
  Identified 1034 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1034 unique weight vectors)
Pureness (proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 851

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1080
  Number of unique weight vectors: 1034

Time to load and analyse the weight vector file: 0.01 sec
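The non-pure clean-up above groups identical weight vectors, computes each group's pureness (proportion of match labels), and drops the minority-class copies of any group that carries both labels. A minimal sketch (the helper name is ours; tie handling in the original script may differ):

```python
from collections import Counter, defaultdict

def remove_non_pure(weight_vectors, labels):
    # Count match / non-match labels per unique weight vector
    counts = defaultdict(Counter)
    for vec, lab in zip(weight_vectors, labels):
        counts[tuple(vec)][lab] += 1
    cleaned = []
    for vec, lab in zip(weight_vectors, labels):
        c = counts[tuple(vec)]
        # Keep pure vectors and the majority-class copies of mixed ones;
        # exact ties are dropped entirely in this sketch
        if len(c) == 1 or c[lab] > c[not lab]:
            cleaned.append((vec, lab))
    return cleaned
```

In the run above this is consistent with one vector occurring 12 times with pureness 0.917 (11 match and 1 non-match copies): its single minority copy is removed, leaving 1080 of the 1081 vectors.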

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1034, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1034 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1034 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 946 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 845 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (845, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 101 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)932_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 932), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)932_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 830
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 830 weight vectors
  Containing 207 true matches and 623 true non-matches
    (24.94% true matches)
  Identified 783 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   748  (95.53%)
          2 :    32  (4.09%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 783 unique weight vectors)
Pureness (proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 180
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 602

Removed 1 non-pure weight vector

Final number of weight vectors to use: 829
  Number of unique weight vectors: 783

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (783, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 783 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 783 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 24 matches and 61 non-matches
    Purity of oracle classification:  0.718
    Entropy of oracle classification: 0.859
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 698 weight vectors
  Based on 24 matches and 61 non-matches
  Classified 97 matches and 601 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (97, 0.7176470588235294, 0.8586370819183629, 0.2823529411764706)
    (601, 0.7176470588235294, 0.8586370819183629, 0.2823529411764706)

Current size of match and non-match training data sets: 24 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 601 weight vectors
- Estimated match proportion 0.282

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 601 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.423, 0.478, 0.500, 0.813, 0.545] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 17 matches and 52 non-matches
    Purity of oracle classification:  0.754
    Entropy of oracle classification: 0.805
    Number of true matches:      17
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0
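
The purity and entropy figures printed after each oracle round can be reproduced with a short sketch. This assumes purity is the majority-class fraction and entropy is the binary Shannon entropy of the match/non-match split, which is consistent with the numbers above (17 matches, 52 non-matches gives purity ≈ 0.754, entropy ≈ 0.805); the actual script may round or compute these slightly differently.

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a labelled cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total           # match proportion
    purity = max(p, 1.0 - p)          # fraction of vectors in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                   # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# For the oracle round above: 17 matches and 52 non-matches
purity, entropy = purity_entropy(17, 52)
```

A perfectly pure cluster (all matches or all non-matches) gives purity 1.0 and entropy 0.0, which is why the algorithm stops splitting clusters once purity is high enough.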

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(10)473_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 473), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)473_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 671
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 671 weight vectors
  Containing 199 true matches and 472 true non-matches
    (29.66% true matches)
  Identified 626 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   592  (94.57%)
          2 :    31  (4.95%)
          3 :     2  (0.32%)
         11 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 626 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 451

Removed 1 non-pure weight vector

Final number of weight vectors to use: 670
  Number of unique weight vectors: 626

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (626, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 626 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 626 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
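
The "farthest first" selection above can be sketched as a greedy k-center pick: repeatedly choose the vector whose distance to its nearest already-selected vector is largest, so the labelled sample spreads out over the cluster. The seed choice (first vector) and Euclidean distance here are assumptions; the script's own seeding and tie-breaking may differ.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(vectors, k, seed_index=0):
    """Greedy farthest-first (k-center) selection of k vectors."""
    selected = [vectors[seed_index]]
    remaining = [v for i, v in enumerate(vectors) if i != seed_index]
    while len(selected) < k and remaining:
        # Pick the vector whose distance to its nearest selected vector is largest
        best = max(remaining,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because each pick maximises the distance to everything chosen so far, the sample tends to cover both very match-like and very non-match-like corners of the weight-vector space, which is what makes it useful for training-example selection.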

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 25 matches and 58 non-matches
    Purity of oracle classification:  0.699
    Entropy of oracle classification: 0.883
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 543 weight vectors
  Based on 25 matches and 58 non-matches
  Classified 142 matches and 401 non-matches
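
After each oracle round, the remaining unlabelled vectors in the cluster are split by a classifier trained on the newly labelled matches and non-matches (an SVM in this run). As a dependency-free stand-in, this sketch performs the same split with a nearest-centroid rule; the function names are illustrative and not taken from the script.

```python
import math

def centroid(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(unlabelled, matches, non_matches):
    """Split unlabelled vectors into predicted matches / non-matches
    using a nearest-centroid rule (stand-in for the SVM split above)."""
    m_cent, n_cent = centroid(matches), centroid(non_matches)
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    pred_matches, pred_non = [], []
    for v in unlabelled:
        if dist(v, m_cent) <= dist(v, n_cent):
            pred_matches.append(v)
        else:
            pred_non.append(v)
    return pred_matches, pred_non
```

The two resulting sub-clusters are then pushed back onto the queue with the purity, entropy, and estimated match proportion inherited from the oracle round, as the Loop 2 output below shows.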

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6987951807228916, 0.8827586787955115, 0.30120481927710846)
    (401, 0.6987951807228916, 0.8827586787955115, 0.30120481927710846)

Current size of match and non-match training data sets: 25 / 58

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 142 weight vectors
- Estimated match proportion 0.301

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 50 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.235
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing file: diverg(10)498_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984848
recall                 0.217391
f-measure              0.356164
da                           66
dm                            0
ndm                           0
tp                           65
fp                            1
tn                  4.76529e+07
fn                          234
Name: (10, 1 - acm diverg, 498), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)498_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 224
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 224 weight vectors
  Containing 176 true matches and 48 true non-matches
    (78.57% true matches)
  Identified 199 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   188  (94.47%)
          2 :     8  (4.02%)
          3 :     2  (1.01%)
         14 :     1  (0.50%)

Identified 1 non-pure unique weight vector (from 199 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 151
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 47

Removed 1 non-pure weight vector

Final number of weight vectors to use: 223
  Number of unique weight vectors: 199

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (199, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 199 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 65

Perform initial selection using "far" method

Farthest first selection of 65 weight vectors from 199 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 65 weight vectors
  The oracle will correctly classify 65 weight vectors and wrongly classify 0
  Classified 36 matches and 29 non-matches
    Purity of oracle classification:  0.554
    Entropy of oracle classification: 0.992
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  29
    Number of false non-matches: 0

Deleted 65 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 134 weight vectors
  Based on 36 matches and 29 non-matches
  Classified 125 matches and 9 non-matches

  Non-match cluster not large enough for required sample size
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 1
  Number of manual oracle classifications performed: 65
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (125, 0.5538461538461539, 0.9916178297881032, 0.5538461538461539)

Current size of match and non-match training data sets: 36 / 29

Selected cluster with (queue ordering: random):
- Purity 0.55 and entropy 0.99
- Size 125 weight vectors
- Estimated match proportion 0.554

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 125 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 45 matches and 9 non-matches
    Purity of oracle classification:  0.833
    Entropy of oracle classification: 0.650
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

66.0
Analyzing file: diverg(10)82_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 82), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)82_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 770
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 770 weight vectors
  Containing 212 true matches and 558 true non-matches
    (27.53% true matches)
  Identified 718 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   683  (95.13%)
          2 :    32  (4.46%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 718 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 537

Removed 1 non-pure weight vector

Final number of weight vectors to use: 769
  Number of unique weight vectors: 718

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (718, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 718 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 718 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0
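The purity and entropy figures reported for the oracle sample follow the standard definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch of these two statistics (the function name `cluster_stats` is illustrative, not taken from the script):

```python
import math

def cluster_stats(num_match, num_nonmatch):
    # Purity: fraction of the majority class among classified vectors.
    # Entropy: binary Shannon entropy (in bits) of the match proportion.
    total = num_match + num_nonmatch
    p = num_match / total            # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                  # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy, p

# 31 matches and 53 non-matches, as in the oracle output above:
purity, entropy, p = cluster_stats(31, 53)
# purity ~ 0.631, entropy ~ 0.950, p ~ 0.369
```

The (size, purity, entropy, match proportion) tuples printed for the queue in the next loop are exactly these quantities, carried over from the oracle-classified sample.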

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 634 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 293 matches and 341 non-matches
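The split step trains a binary classifier on the oracle-labelled vectors and partitions the remaining unlabelled vectors into a predicted-match and a predicted-non-match cluster. The script uses an SVM; the sketch below deliberately substitutes a plain perceptron so it runs with the standard library only — it is purely illustrative and not the script's code:

```python
def train_perceptron(X, y, epochs=100, lr=0.1):
    # X: list of weight vectors; y: labels in {-1 (non-match), +1 (match)}.
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        updated = False
        for xi, yi in zip(X, y):
            # Misclassified (or on the boundary): nudge the hyperplane.
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) <= 0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
                updated = True
        if not updated:              # converged: training sample separated
            break
    return w, b

def split_cluster(w, b, vectors):
    # Partition unlabelled vectors by the learned decision function.
    score = lambda v: sum(wj * xj for wj, xj in zip(w, v)) + b
    pred_match = [v for v in vectors if score(v) > 0]
    pred_nonmatch = [v for v in vectors if score(v) <= 0]
    return pred_match, pred_nonmatch
```

Both resulting clusters are pushed back onto the queue, which is why the queue length grows to 2 in the next loop.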

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (293, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (341, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 341 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 341 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.367, 0.667, 0.583, 0.625, 0.316] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.438, 0.500, 0.467, 0.529, 0.611] (False)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [0.857, 0.000, 0.500, 0.389, 0.235, 0.045, 0.526] (False)
    [1.000, 0.000, 0.476, 0.179, 0.500, 0.412, 0.357] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.833, 0.571, 0.727, 0.647, 0.857] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.583, 0.875, 0.727, 0.833, 0.643] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
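Farthest-first selection, as listed above, is a greedy traversal: starting from a seed vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster. A stdlib-only sketch under that assumption (the function name and fixed seed choice are illustrative):

```python
import math

def farthest_first(vectors, k, seed_index=0):
    # Greedy farthest-first traversal over Euclidean distance.
    selected = [vectors[seed_index]]
    remaining = [v for i, v in enumerate(vectors) if i != seed_index]
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest selected neighbour.
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each greedy step costs O(|selected| * |remaining|) distance evaluations, which is cheap at the sample sizes shown here (tens of vectors from a few hundred).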

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analyzing file: diverg(20)614_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 614), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)614_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027
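The load/analyse step above counts how often each identical weight vector occurs and measures its pureness, i.e. the fraction of true matches among its duplicate occurrences; vectors that are not fully pure have their minority-class copies removed. A sketch of those two statistics (names are illustrative, not the script's):

```python
from collections import Counter

def analyse_weight_vectors(vectors, is_match):
    # Count duplicates of each identical weight vector.
    counts = Counter(map(tuple, vectors))
    match_counts = Counter(t for t, m in zip(map(tuple, vectors), is_match) if m)
    # Occurrence count -> number of unique vectors occurring that often.
    freq_dist = Counter(counts.values())
    # Pureness of a unique vector: fraction of its copies that are matches.
    pureness = {t: match_counts[t] / c for t, c in counts.items()}
    return freq_dist, pureness
```

In the run above, one vector occurring 20 times has pureness 0.950, so its single minority-class copy is removed, leaving 1083 of the 1084 vectors.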

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 179 matches and 760 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (760, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 760 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)300_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 300), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)300_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 799
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 799 weight vectors
  Containing 224 true matches and 575 true non-matches
    (28.04% true matches)
  Identified 760 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   741  (97.50%)
          2 :    16  (2.11%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 760 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 572

Removed 1 non-pure weight vector

Final number of weight vectors to use: 798
  Number of unique weight vectors: 760

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (760, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 760 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 675 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 149 matches and 526 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (526, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 149 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
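
Farthest-first selection, used above to pick a diverse sample from the cluster, greedily adds the vector whose minimum distance to the already-selected set is largest. A sketch, seeding from the first vector (the actual program may seed and break ties differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over weight vectors (tuples of
    floats): repeatedly select the remaining vector that is farthest
    from its nearest already-selected vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# The three corners of this toy set are picked before the interior point.
corners = farthest_first([(0, 0), (1, 1), (10, 0), (0, 10)], 3)
```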

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 49 matches and 6 non-matches
    Purity of oracle classification:  0.891
    Entropy of oracle classification: 0.497
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0
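
The usage string in the header lists an `oracle_acc` parameter; this run uses 100% accuracy, so every oracle label is correct. An imperfect oracle could be simulated by flipping each true match status independently with probability 1 - accuracy (the program's exact noise model is an assumption here):

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Simulate a human oracle: each true match status is flipped with
    probability 1 - accuracy.  (The original program's noise model is
    an assumption; the seed is arbitrary.)"""
    rng = rng or random.Random(42)
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]

# With accuracy 1.0 nothing is flipped: 49 matches and 6 non-matches,
# exactly as in the oracle step above.
labels = oracle_classify([True] * 49 + [False] * 6, 1.0)
```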

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further
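
The messages "Cluster not pure enough or too large, and can be split further" and "Reached end of manual classification budget" suggest a per-cluster decision driven by the min_purity, min_cluster_size, and max_cluster_size parameters from the usage string. A hedged sketch of that decision (the discard branch for under-sized clusters is an assumption):

```python
def cluster_action(size, purity, min_cluster_size, max_cluster_size,
                   min_purity):
    """Decide a cluster's fate after oracle sampling, following the
    log messages; parameter names follow the usage string."""
    if purity >= min_purity and size <= max_cluster_size:
        return 'accept'   # pure enough and small enough: keep as training data
    if size < min_cluster_size:
        return 'discard'  # too small to split further (assumption)
    return 'split'        # not pure enough or too large: split via classifier
```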

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)448_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 448), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)448_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 529
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 529 weight vectors
  Containing 225 true matches and 304 true non-matches
    (42.53% true matches)
  Identified 490 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   471  (96.12%)
          2 :    16  (3.27%)
          3 :     2  (0.41%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 490 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 301
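
The occurrence distribution and pureness table above can be reproduced with `collections.Counter`: the pureness of a unique weight vector is the fraction of its occurrences generated by true matches. A sketch (function name illustrative):

```python
from collections import Counter

def analyse_weight_vectors(weight_vectors, true_labels):
    """Occurrence distribution and per-unique-vector pureness, as in
    the analysis step above."""
    vecs = [tuple(v) for v in weight_vectors]
    occ = Counter(vecs)                   # unique vector -> occurrence count
    freq_dist = Counter(occ.values())     # occurrence -> number of uniques
    match_count = Counter(v for v, lbl in zip(vecs, true_labels) if lbl)
    pureness = {v: match_count[v] / n for v, n in occ.items()}
    return freq_dist, pureness

# Tiny example: one vector occurs twice with mixed labels -> pureness 0.5.
fd, pure = analyse_weight_vectors([(1.0, 0.9), (1.0, 0.9), (0.2, 0.1)],
                                  [True, False, False])
```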

Removed 1 non-pure weight vector

Final number of weight vectors to use: 528
  Number of unique weight vectors: 490

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (490, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 490 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 490 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 36 matches and 44 non-matches
    Purity of oracle classification:  0.550
    Entropy of oracle classification: 0.993
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 410 weight vectors
  Based on 36 matches and 44 non-matches
  Classified 173 matches and 237 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (173, 0.55, 0.9927744539878084, 0.45)
    (237, 0.55, 0.9927744539878084, 0.45)

Current size of match and non-match training data sets: 36 / 44

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 237 weight vectors
- Estimated match proportion 0.450

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 237 vectors
  The selected farthest weight vectors are:
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.818, 0.727, 0.438, 0.375, 0.400] (False)
    [1.000, 0.000, 0.800, 0.636, 0.563, 0.545, 0.722] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 1 match and 67 non-matches
    Purity of oracle classification:  0.985
    Entropy of oracle classification: 0.111
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)65_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 65), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)65_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 845
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 845 weight vectors
  Containing 227 true matches and 618 true non-matches
    (26.86% true matches)
  Identified 788 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   751  (95.30%)
          2 :    34  (4.31%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 788 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 597

Removed 1 non-pure weight vector

Final number of weight vectors to use: 844
  Number of unique weight vectors: 788

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (788, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 788 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 788 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 703 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 162 matches and 541 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (162, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (541, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 162 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 162 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)954_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 954), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)954_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 861
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 861 weight vectors
  Containing 227 true matches and 634 true non-matches
    (26.36% true matches)
  Identified 804 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   767  (95.40%)
          2 :    34  (4.23%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 804 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 613

Removed 1 non-pure weight vector

Final number of weight vectors to use: 860
  Number of unique weight vectors: 804

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (804, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 804 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 804 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
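The "far" initial selection above is a farthest-first traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and seeding from the first vector (the original script's seeding and distance choices may differ):

```python
import math

def farthest_first(vectors, k):
    """Farthest-first traversal: greedily select k vectors so that each
    new vector maximises its minimum Euclidean distance to the set of
    already-selected vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                          # arbitrary seed
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        # Pick the vector farthest from everything selected so far
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        # Update each vector's distance to its nearest selected vector
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[idx]))
    return selected
```

This greedy scheme spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches and clear non-matches.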

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
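The oracle step simulates a human reviewer with a configurable accuracy; at 100.00% accuracy every queried label is returned correctly, so no false matches or false non-matches occur. A sketch, assuming each label is flipped independently with probability 1 - accuracy (the script's actual error-injection scheme is an assumption here):

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Return the true match status of each queried weight vector,
    flipping each label with probability (1 - accuracy) to simulate
    an imperfect human oracle."""
    rng = rng or random.Random()
    classified = []
    for label in true_labels:
        if rng.random() < accuracy:
            classified.append(label)       # correct classification
        else:
            classified.append(not label)   # oracle error: flipped label
    return classified
```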

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 718 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 565 non-matches
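The split step trains a classifier on the oracle-labelled vectors and uses it to divide the cluster's remaining vectors into a predicted-match and a predicted-non-match sub-cluster, both of which go back onto the queue. A sketch using scikit-learn's `SVC`; the kernel and parameters of the original script are assumptions:

```python
# Assumes scikit-learn is installed
from sklearn.svm import SVC

def split_cluster(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on the oracle-classified vectors (labels: 1 = match,
    0 = non-match) and split the remaining vectors of the cluster into
    a predicted match cluster and a predicted non-match cluster."""
    clf = SVC(kernel="linear")
    clf.fit(labelled_vecs, labels)
    pred = clf.predict(unlabelled_vecs)
    match_cluster = [v for v, p in zip(unlabelled_vecs, pred) if p]
    non_match_cluster = [v for v, p in zip(unlabelled_vecs, pred) if not p]
    return match_cluster, non_match_cluster
```

In the run above, the 28/58 labelled vectors split the remaining 718 into clusters of 153 predicted matches and 565 predicted non-matches.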

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (565, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
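The purity and entropy carried in each queue entry are consistent with the majority-class fraction of the oracle-classified sample and its binary Shannon entropy: 58 non-matches out of 86 gives purity 58/86 ≈ 0.674 and entropy ≈ 0.910, matching the tuples above. A minimal sketch of that computation:

```python
import math

def purity(num_matches, num_non_matches):
    """Majority-class fraction of a classified sample."""
    return max(num_matches, num_non_matches) / (num_matches + num_non_matches)

def binary_entropy(p):
    """Base-2 Shannon entropy of a two-class distribution (0 at p = 0 or 1)."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# 28 matches / 58 non-matches, as classified by the oracle in Loop 1
p = purity(28, 58)       # ~0.6744
h = binary_entropy(p)    # ~0.9103
```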

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 153 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)310_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 310), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)310_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 902
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 902 weight vectors
  Containing 178 true matches and 724 true non-matches
    (19.73% true matches)
  Identified 863 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   833  (96.52%)
          2 :    27  (3.13%)
          3 :     2  (0.23%)
          9 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 863 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 159
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 703

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 893
  Number of unique weight vectors: 862

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (862, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 862 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 862 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 776 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 94 matches and 682 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (682, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 94 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 94 vectors
  The selected farthest weight vectors are:
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 43 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analyzing file: diverg(15)63_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (15, 1 - acm diverg, 63), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)63_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 562
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 562 weight vectors
  Containing 173 true matches and 389 true non-matches
    (30.78% true matches)
  Identified 544 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   535  (98.35%)
          2 :     6  (1.10%)
          3 :     2  (0.37%)
          9 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 544 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 155
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 388

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 553
  Number of unique weight vectors: 543

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (543, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 543 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 543 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 26 matches and 55 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.905
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 462 weight vectors
  Based on 26 matches and 55 non-matches
  Classified 114 matches and 348 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (114, 0.6790123456790124, 0.9054522631867894, 0.32098765432098764)
    (348, 0.6790123456790124, 0.9054522631867894, 0.32098765432098764)

Current size of match and non-match training data sets: 26 / 55

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 348 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 348 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.565, 0.667, 0.600, 0.412, 0.381] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.455, 0.714, 0.429, 0.550, 0.647] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
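The farthest-first traversal used for this selection (repeatedly adding the vector whose distance to its nearest already-selected vector is largest) can be sketched as follows; the function name, the Euclidean metric, and seeding with the first vector are assumptions for illustration, not the program's exact code:

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal: each new pick
    maximises the distance to its nearest already-selected vector."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    selected = [vectors[0]]  # seed with an arbitrary (here: first) vector
    # Distance from every candidate to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[idx]))
    return selected
```

This spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.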

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 11 matches and 56 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.644
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
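The purity and entropy figures reported by the oracle follow the standard two-class definitions (purity is the majority-class fraction, entropy is Shannon entropy in bits); this is a sketch consistent with the logged values, not necessarily the program's exact code:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Two-class purity (majority fraction) and Shannon entropy (bits)."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the 11 matches and 56 non-matches above this gives purity 56/67 ≈ 0.836 and entropy ≈ 0.644, matching the log.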

Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing file: diverg(10)711_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 711), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)711_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 645
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 645 weight vectors
  Containing 197 true matches and 448 true non-matches
    (30.54% true matches)
  Identified 590 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   554  (93.90%)
          2 :    33  (5.59%)
          3 :     2  (0.34%)
         19 :     1  (0.17%)
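The occurrence distribution above can be reproduced with two nested `Counter` passes: one counting how often each distinct weight vector occurs, one counting how many vectors share each occurrence count. A sketch of the analysis, assuming a plain list of weight vectors:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of distinct weight vectors
    that occur exactly that often."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```

Here one weight vector occurring 19 times accounts for the gap between 645 loaded vectors and 590 unique ones.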

Identified 1 non-pure unique weight vector (from 590 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 162
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 427

Removed 1 non-pure weight vector
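A unique weight vector is non-pure when identical similarity values were produced by both a true match and a true non-match; its pureness is the fraction of its occurrences that are matches. A minimal sketch of this check, assuming a list of (vector, is_match) pairs:

```python
from collections import defaultdict

def pureness_table(weight_vectors):
    """Pureness (fraction of true matches) per unique weight vector.
    Values strictly between 0 and 1 mark non-pure vectors."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[tuple(vec)].append(is_match)
    return {vec: sum(labels) / len(labels) for vec, labels in groups.items()}
```

Consistent with the log above, the single non-pure vector occurred 19 times with 18 true matches, giving a pureness of 18/19 ≈ 0.947; its one minority-class (non-match) copy is what gets removed.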

Final number of weight vectors to use: 644
  Number of unique weight vectors: 590

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (590, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 590 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 590 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 28 matches and 54 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 508 weight vectors
  Based on 28 matches and 54 non-matches
  Classified 181 matches and 327 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (181, 0.6585365853658537, 0.9262122127346665, 0.34146341463414637)
    (327, 0.6585365853658537, 0.9262122127346665, 0.34146341463414637)

Current size of match and non-match training data sets: 28 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 181 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 59

Farthest first selection of 59 weight vectors from 181 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.530, 1.000, 0.159, 0.086, 0.182, 0.159, 0.163] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 59 weight vectors
  The oracle will correctly classify 59 weight vectors and wrongly classify 0
  Classified 42 matches and 17 non-matches
    Purity of oracle classification:  0.712
    Entropy of oracle classification: 0.866
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  17
    Number of false non-matches: 0

Deleted 59 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)784_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990385
recall                 0.344482
f-measure              0.511166
da                          104
dm                            0
ndm                           0
tp                          103
fp                            1
tn                  4.76529e+07
fn                          196
Name: (10, 1 - acm diverg, 784), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)784_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 305
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 305 weight vectors
  Containing 146 true matches and 159 true non-matches
    (47.87% true matches)
  Identified 288 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   279  (96.88%)
          2 :     6  (2.08%)
          3 :     2  (0.69%)
          8 :     1  (0.35%)

Identified 1 non-pure unique weight vector (from 288 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 131
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 156

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 297
  Number of unique weight vectors: 287

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (287, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 287 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 72

Perform initial selection using "far" method

Farthest first selection of 72 weight vectors from 287 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 34 matches and 38 non-matches
    Purity of oracle classification:  0.528
    Entropy of oracle classification: 0.998
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 215 weight vectors
  Based on 34 matches and 38 non-matches
  Classified 130 matches and 85 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 72
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.5277777777777778, 0.9977724720899821, 0.4722222222222222)
    (85, 0.5277777777777778, 0.9977724720899821, 0.4722222222222222)

Current size of match and non-match training data sets: 34 / 38

Selected cluster (queue ordering: random) with:
- Purity 0.53 and entropy 1.00
- Size 130 weight vectors
- Estimated match proportion 0.472

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 130 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.619, 1.000, 0.103, 0.163, 0.129, 0.146, 0.213] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.800, 1.000, 0.167, 0.180, 0.151, 0.147, 0.203] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 37 matches and 18 non-matches
    Purity of oracle classification:  0.673
    Entropy of oracle classification: 0.912
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  18
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

104.0
Analysing file: diverg(10)609_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.980583
recall                 0.337793
f-measure              0.502488
da                          103
dm                            0
ndm                           0
tp                          101
fp                            2
tn                  4.76529e+07
fn                          198
Name: (10, 1 - acm diverg, 609), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)609_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 237
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 237 weight vectors
  Containing 138 true matches and 99 true non-matches
    (58.23% true matches)
  Identified 222 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   211  (95.05%)
          2 :     8  (3.60%)
          3 :     2  (0.90%)
          4 :     1  (0.45%)

Identified 0 non-pure unique weight vectors (from 222 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 125
     0.000 : 97

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 237
  Number of unique weight vectors: 222

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (222, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 222 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 67

Perform initial selection using "far" method

Farthest first selection of 67 weight vectors from 222 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.344, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
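Farthest first selection, as used above, greedily picks each next weight vector to maximise its minimum distance to the vectors already selected. A minimal sketch, assuming Euclidean distance and an arbitrary starting vector (the program's seeding strategy is not shown in this log):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal: return indices of k vectors
    that are maximally spread out (assumed Euclidean distance)."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [seed]  # assumption: start from an arbitrary vector
    # Minimum distance from every vector to the selected set so far
    min_dist = np.linalg.norm(vectors - vectors[seed], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))      # farthest from all selected
        selected.append(nxt)
        d = np.linalg.norm(vectors - vectors[nxt], axis=1)
        min_dist = np.minimum(min_dist, d)  # keep nearest-selected distance
    return selected
```

Already-selected vectors have minimum distance zero, so they are never picked again while any unselected vector remains at positive distance.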

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 30 matches and 37 non-matches
    Purity of oracle classification:  0.552
    Entropy of oracle classification: 0.992
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0
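The purity and entropy figures above follow directly from the oracle's match/non-match counts. A sketch, assuming purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Majority-class purity and binary entropy (in bits) of a cluster,
    given its match and non-match counts."""
    total = num_match + num_non_match
    p = num_match / total                  # match proportion
    purity = max(p, 1.0 - p)               # fraction in the majority class
    if p in (0.0, 1.0):
        return purity, 0.0                 # a pure cluster has zero entropy
    entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy
```

With the 30 matches and 37 non-matches above this gives purity 37/67 ≈ 0.552 and entropy ≈ 0.992, matching the log.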

Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 155 weight vectors
  Based on 30 matches and 37 non-matches
  Classified 101 matches and 54 non-matches
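After the oracle-labeled vectors are removed, the remaining vectors in the cluster are split by a classifier trained on the accumulated match/non-match training sets. A hedged sketch using scikit-learn's `SVC`; the kernel and parameters are assumptions, as the log does not show them:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_match, train_non_match, remaining):
    """Train an SVM on oracle-labeled weight vectors, then split the
    remaining vectors into predicted matches and non-matches."""
    X = np.vstack([train_match, train_non_match])
    y = np.array([1] * len(train_match) + [0] * len(train_non_match))
    clf = SVC(kernel="linear")  # assumption: kernel not shown in the log
    clf.fit(X, y)
    pred = clf.predict(np.asarray(remaining))
    matches = [v for v, p in zip(remaining, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining, pred) if p == 0]
    return matches, non_matches
```

In the run above, an SVM trained on 30 matches and 37 non-matches splits the 155 remaining vectors into 101 predicted matches and 54 predicted non-matches, which become the two clusters queued in Loop 2.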

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 67
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.5522388059701493, 0.99211169200215, 0.44776119402985076)
    (54, 0.5522388059701493, 0.99211169200215, 0.44776119402985076)

Current size of match and non-match training data sets: 30 / 37

Selected cluster with (queue ordering: random):
- Purity 0.55 and entropy 0.99
- Size 101 weight vectors
- Estimated match proportion 0.448

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 43 matches and 6 non-matches
    Purity of oracle classification:  0.878
    Entropy of oracle classification: 0.536
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing file: diverg(10)726_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990566
recall                 0.351171
f-measure              0.518519
da                          106
dm                            0
ndm                           0
tp                          105
fp                            1
tn                  4.76529e+07
fn                          194
Name: (10, 1 - acm diverg, 726), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)726_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 634
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 634 weight vectors
  Containing 154 true matches and 480 true non-matches
    (24.29% true matches)
  Identified 598 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   570  (95.32%)
          2 :    25  (4.18%)
          3 :     2  (0.33%)
          8 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 598 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 138
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 459

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 626
  Number of unique weight vectors: 597

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (597, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 597 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 597 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 26 matches and 57 non-matches
    Purity of oracle classification:  0.687
    Entropy of oracle classification: 0.897
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 514 weight vectors
  Based on 26 matches and 57 non-matches
  Classified 89 matches and 425 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (89, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)
    (425, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)

Current size of match and non-match training data sets: 26 / 57

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.90
- Size 425 weight vectors
- Estimated match proportion 0.313

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 425 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

106.0
Analysing file: diverg(15)970_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 970), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)970_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1064
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1064 weight vectors
  Containing 219 true matches and 845 true non-matches
    (20.58% true matches)
  Identified 1008 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   972  (96.43%)
          2 :    33  (3.27%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1008 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 824

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1063
  Number of unique weight vectors: 1008

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1008, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1008 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1008 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 921 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 325 matches and 596 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (325, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (596, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 325 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 325 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
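Farthest-first selection, as logged above, repeatedly picks the weight vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and a random starting vector (the script's actual distance measure and seeding may differ):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Farthest-first traversal: greedily select k vectors, each time
    taking the one farthest from everything selected so far (sketch)."""
    vectors = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(vectors)))]       # random start
    min_dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))                 # farthest from selection
        selected.append(nxt)
        dist = np.linalg.norm(vectors - vectors[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist)          # update nearest distances
    return selected

pts = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [1.0, 0.9], [0.5, 0.5]]
print(farthest_first(pts, 3))
```

Because each pick maximises the distance to the current selection, the sample spreads across the whole cluster, which is why the lists above mix clear matches and clear non-matches.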

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 41 matches and 28 non-matches
    Purity of oracle classification:  0.594
    Entropy of oracle classification: 0.974
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(20)887_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 887), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)887_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
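The occurrence distribution above counts how many unique weight vectors appear once, twice, and so on, as a fraction of all unique vectors. It can be sketched with two nested `Counter`s (function and variable names are assumptions):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight vectors
    occurring that often, as in the frequency report above (sketch)."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)  # vector -> copies
    return Counter(vec_counts.values())                     # copies -> vectors

dist = occurrence_distribution([(1, 0), (1, 0), (0, 1), (1, 1)])
total = sum(dist.values())                                  # unique vectors
for occ in sorted(dist):
    print(f"{occ:>4} : {dist[occ]:>5}  ({100.0 * dist[occ] / total:.2f}%)")
# prints one line per occurrence count, e.g. "1 : 2 (66.67%)"
```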

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority-class weight vectors with this pureness will be removed)
     0.000 : 579

Removed 1 non-pure weight vector
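Removing non-pure weight vectors drops the minority-class copies of any unique vector that carries mixed labels (like the single vector with pureness 0.95 above), keeping only the majority label. A sketch; tie-breaking toward the match class is an assumption:

```python
from collections import defaultdict

def remove_non_pure(pairs):
    """Given (weight_vector, is_match) pairs, drop minority-class copies
    of vectors with mixed labels, keeping the majority label (sketch)."""
    by_vec = defaultdict(list)
    for vec, is_match in pairs:
        by_vec[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        majority = sum(labels) * 2 >= len(labels)   # ties resolved as match
        # keep only the copies carrying the majority label
        kept.extend((vec, lab) for lab in labels if lab == majority)
    return kept

# one unique vector, 19 match copies and 1 non-match copy: pureness 0.95
data = [((1.0, 0.9), True)] * 19 + [((1.0, 0.9), False)]
print(len(remove_non_pure(data)))  # → 19
```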

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 146 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (538, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 538 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 538 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 9 matches and 65 non-matches
    Purity of oracle classification:  0.878
    Entropy of oracle classification: 0.534
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)814_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.977273
recall                 0.431438
f-measure              0.598608
da                          132
dm                            0
ndm                           0
tp                          129
fp                            3
tn                  4.76529e+07
fn                          170
Name: (10, 1 - acm diverg, 814), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)814_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 784
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 784 weight vectors
  Containing 127 true matches and 657 true non-matches
    (16.20% true matches)
  Identified 753 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   725  (96.28%)
          2 :    25  (3.32%)
          3 :     3  (0.40%)

Identified 0 non-pure unique weight vectors (from 753 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 116
     0.000 : 637

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 784
  Number of unique weight vectors: 753

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (753, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 753 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 753 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 26 matches and 59 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.888
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 668 weight vectors
  Based on 26 matches and 59 non-matches
  Classified 113 matches and 555 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (113, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)
    (555, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)

Current size of match and non-match training data sets: 26 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 555 weight vectors
- Estimated match proportion 0.306

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 555 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

132.0
Analyzing file: diverg(15)847_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 847), dtype: object
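The per-file summary above reports precision, recall, and f-measure derived from the confusion counts (tp, fp, fn). A minimal sketch of that arithmetic, assuming the standard definitions (the function name `prf` is illustrative, not from the original script):

```python
def prf(tp, fp, fn):
    """Compute precision, recall and F-measure from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# The counts from the summary above: tp=40, fp=0, fn=259
p, r, f = prf(40, 0, 259)
print(round(p, 6), round(r, 6), round(f, 6))  # 1.0 0.133779 0.235988
```

These values reproduce the precision, recall, and f-measure lines in the summary above.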

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)847_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 597
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 597 weight vectors
  Containing 214 true matches and 383 true non-matches
    (35.85% true matches)
  Identified 563 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   548  (97.34%)
          2 :    12  (2.13%)
          3 :     2  (0.36%)
         19 :     1  (0.18%)
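The occurrence histogram above first counts how often each distinct weight vector appears, then tallies how many unique vectors share each count. A sketch using `collections.Counter` (the vector data here is a toy stand-in, not from the loaded file):

```python
from collections import Counter

# Toy stand-in for the loaded weight vectors (tuples so they are hashable)
vectors = [(1.0, 0.0), (1.0, 0.0), (0.5, 0.3),
           (0.2, 0.9), (0.2, 0.9), (0.2, 0.9)]

occ = Counter(vectors)          # occurrences of each unique weight vector
dist = Counter(occ.values())    # occurrence count : number of unique vectors

for count in sorted(dist):
    n = dist[count]
    print('%5d : %5d  (%.2f%%)' % (count, n, 100.0 * n / len(occ)))
```

With the toy data this prints one unique vector each occurring once, twice, and three times, in the same layout as the table above.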

Identified 1 non-pure unique weight vector (from 563 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 382

Removed 1 non-pure weight vector

Final number of weight vectors to use: 596
  Number of unique weight vectors: 563

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (563, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 563 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 563 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
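Farthest-first selection, as reported in these runs, starts from a seed vector and greedily adds the vector whose distance to the closest already-selected vector is largest. A minimal sketch assuming Euclidean distance and an arbitrary first seed (the original script may use a different metric or seeding rule):

```python
import math

def farthest_first(vectors, k):
    """Greedily pick k vectors, each maximising its distance
    to the closest already-selected vector."""
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]                      # arbitrary seed
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):          # refresh closest-selected distances
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected

corners = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.0, 1.0)], 3)
# picks the mutually far-apart corners, skipping (0.1, 0.0)
```

Each round costs one pass over the cluster, so selecting k of n vectors is O(k·n) distance updates.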

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
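The purity and entropy figures reported for each oracle-classified sample follow the usual binary definitions: purity is the majority-class fraction, entropy the Shannon entropy of the match/non-match split. A sketch of that computation (the function name is illustrative, not from the original script):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Majority-class purity and binary Shannon entropy of a sample."""
    total = num_match + num_non_match
    purity = max(num_match, num_non_match) / total
    entropy = 0.0
    for c in (num_match, num_non_match):
        if c:                        # 0 * log(0) is taken as 0
            p = c / total
            entropy -= p * math.log(p, 2)
    return purity, entropy

# The sample above: 27 matches, 55 non-matches
pur, ent = purity_entropy(27, 55)
print(round(pur, 3), round(ent, 3))  # 0.671 0.914
```

These match the purity 0.671 and entropy 0.914 reported for this 82-vector sample; a one-class sample (as in the earlier 71-vector oracle call) gives purity 1.000 and entropy 0.000.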

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 481 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 142 matches and 339 non-matches
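Each split step trains an SVM on the oracle-labelled sample and uses it to partition the remaining vectors of the cluster into predicted matches and non-matches, as in the "SVM classification ... Based on 27 matches and 55 non-matches" lines above. A sketch with scikit-learn on toy data (the original script may use different SVM parameters or another SVM implementation):

```python
from sklearn.svm import SVC

# Toy labelled sample standing in for the oracle-classified weight vectors
train_vecs = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
train_labels = [1, 1, 0, 0]               # 1 = match, 0 = non-match

clf = SVC(kernel='linear')
clf.fit(train_vecs, train_labels)

# Split the rest of the cluster into predicted matches / non-matches
rest = [[0.85, 0.95], [0.15, 0.05], [0.7, 0.9]]
pred = clf.predict(rest)
matches = [v for v, p in zip(rest, pred) if p == 1]
non_matches = [v for v, p in zip(rest, pred) if p == 0]
```

The two predicted sub-clusters are then pushed back onto the queue, which is why the queue length grows from 1 to 2 in the next loop.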

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (339, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 142 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 49 matches and 4 non-matches
    Purity of oracle classification:  0.925
    Entropy of oracle classification: 0.386
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(20)627_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 627), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)627_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 970
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 970 weight vectors
  Containing 219 true matches and 751 true non-matches
    (22.58% true matches)
  Identified 915 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   879  (96.07%)
          2 :    33  (3.61%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 915 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 730

Removed 1 non-pure weight vector

Final number of weight vectors to use: 969
  Number of unique weight vectors: 915

Time to load and analyse the weight vector file: 0.05 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (915, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 915 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 915 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 828 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 705 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (705, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 705 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 705 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(10)319_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 319), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)319_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 605
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 605 weight vectors
  Containing 154 true matches and 451 true non-matches
    (25.45% true matches)
  Identified 569 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   541  (95.08%)
          2 :    25  (4.39%)
          3 :     2  (0.35%)
          8 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 569 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 138
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 430

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 597
  Number of unique weight vectors: 568

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (568, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 568 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 568 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
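
The "farthest first" selection above can be sketched as a greedy farthest-first traversal: start from a seed vector, then repeatedly pick the vector whose minimum distance to the already-selected set is largest. This is a minimal illustration only, not the program's actual selection code; the seed choice and the Euclidean metric are assumptions.

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal (sketch): select k vectors so that
    each new pick maximises its minimum distance to the picks so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # arbitrary seed (assumption)
    # min_dist[j] = distance from vectors[j] to its nearest selected vector
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected
```

On one-dimensional toy data the traversal picks the extremes first, which is why the selected weight vectors above tend to lie near the corners of the weight space.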

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 25 matches and 57 non-matches
    Purity of oracle classification:  0.695
    Entropy of oracle classification: 0.887
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
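
The purity and entropy figures reported here can be reproduced from the match/non-match counts: purity is the fraction of the majority class, and entropy is the binary Shannon entropy (in bits) of the split. A sketch, assuming these are the definitions used:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy (bits) of the match / non-match proportions."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(25, 57)
print(round(purity, 3), round(entropy, 3))  # 0.695 0.887
```

With 25 matches and 57 non-matches this gives exactly the 0.695 / 0.887 printed above; a perfectly pure cluster (e.g. 0 matches, 65 non-matches) gives purity 1.0 and entropy 0.0.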

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 486 weight vectors
  Based on 25 matches and 57 non-matches
  Classified 155 matches and 331 non-matches
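
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by its predictions. A sketch with a linear scikit-learn SVM, assuming that style of classifier; the program's actual split_classifier implementation may differ.

```python
# Assumes scikit-learn is available; svm_split is a hypothetical helper name.
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, cluster_vectors):
    """Train on the oracle-classified sample (labels 1 = match,
    0 = non-match), then split the unlabelled cluster into
    predicted-match and predicted-non-match sub-clusters."""
    clf = SVC(kernel="linear")
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, preds) if p == 0]
    return matches, non_matches
```

Each sub-cluster is then pushed back onto the queue, which is why the queue length grows by one per split in the loop headers below.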

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6951219512195121, 0.8871723027673717, 0.3048780487804878)
    (331, 0.6951219512195121, 0.8871723027673717, 0.3048780487804878)

Current size of match and non-match training data sets: 25 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.89
- Size 331 weight vectors
- Estimated match proportion 0.305

Sample size for this cluster: 65

Farthest first selection of 65 weight vectors from 331 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.269, 0.478, 0.750, 0.385, 0.455] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.538, 0.600, 0.471, 0.632, 0.688] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.500, 0.571, 0.467, 0.467, 0.389] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.179, 0.500, 0.412, 0.357] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.800, 0.667, 0.381, 0.550, 0.429] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.571, 0.286, 0.333, 0.571, 0.600] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 65 weight vectors
  The oracle will correctly classify 65 weight vectors and wrongly classify 0
  Classified 0 matches and 65 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 65 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing file: diverg(10)707_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990385
recall                 0.344482
f-measure              0.511166
da                          104
dm                            0
ndm                           0
tp                          103
fp                            1
tn                  4.76529e+07
fn                          196
Name: (10, 1 - acm diverg, 707), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)707_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 343
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 343 weight vectors
  Containing 154 true matches and 189 true non-matches
    (44.90% true matches)
  Identified 325 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   315  (96.92%)
          2 :     7  (2.15%)
          3 :     2  (0.62%)
          8 :     1  (0.31%)
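
The occurrence frequency distribution above can be sketched with two nested counts: first count how often each unique weight vector occurs, then tabulate how many unique vectors share each occurrence count. A minimal illustration using the standard library:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight
    vectors that occur exactly that often."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)  # vector -> count
    return Counter(vec_counts.values())                     # count -> #vectors

dist = occurrence_distribution(
    [(0.1, 0.2), (0.1, 0.2), (0.3, 0.4), (0.5, 0.6)]
)
print(sorted(dist.items()))  # [(1, 2), (2, 1)]
```

Vectors are hashed as tuples because lists are unhashable; duplicate weight vectors typically come from distinct record pairs that happen to produce identical similarity values.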

Identified 1 non-pure unique weight vector (from 325 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 138
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 186

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 335
  Number of unique weight vectors: 324

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (324, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 324 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 324 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 28 matches and 46 non-matches
    Purity of oracle classification:  0.622
    Entropy of oracle classification: 0.957
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  46
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 250 weight vectors
  Based on 28 matches and 46 non-matches
  Classified 99 matches and 151 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 74
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (99, 0.6216216216216216, 0.9568886656798214, 0.3783783783783784)
    (151, 0.6216216216216216, 0.9568886656798214, 0.3783783783783784)

Current size of match and non-match training data sets: 28 / 46

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 151 weight vectors
- Estimated match proportion 0.378

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 11 matches and 46 non-matches
    Purity of oracle classification:  0.807
    Entropy of oracle classification: 0.708
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  46
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

104.0
Analysing file: diverg(20)124_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 124), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)124_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
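The "farthest first selection" steps above follow the classic greedy max-min traversal: repeatedly pick the weight vector whose minimum Euclidean distance to the already-selected set is largest. A minimal sketch (the `farthest_first` helper, its seeding, and its tie-breaking are assumptions, not the script's exact code):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first selection: repeatedly pick the vector
    whose minimum distance to the already-selected set is largest."""
    rng = np.random.default_rng(seed)
    X = np.asarray(vectors, dtype=float)
    selected = [int(rng.integers(len(X)))]     # arbitrary starting vector
    # minimum distance from every vector to the selected set so far
    min_dist = np.linalg.norm(X - X[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))         # farthest from all selected
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected
```

Because each pick maximises the minimum distance to everything chosen so far, the sample spreads out over the cluster rather than concentrating in dense regions.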

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
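The purity and entropy values reported after each oracle call follow the standard two-class definitions: purity is the majority-class fraction and entropy is the binary Shannon entropy in bits. A sketch reproducing the figures above (the `purity_entropy` helper is hypothetical):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Two-class purity (majority-class fraction) and Shannon entropy (bits)."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                      # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# Sample from the log above: 14 matches, 54 non-matches
purity, entropy = purity_entropy(14, 54)
print(round(purity, 3), round(entropy, 3))   # 0.794 0.734
```

Purity 1.0 / entropy 0.0 means a perfectly pure sample; purity 0.5 / entropy 1.0 means a maximally mixed one.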

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)815_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 815), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)815_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 396
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 396 weight vectors
  Containing 216 true matches and 180 true non-matches
    (54.55% true matches)
  Identified 363 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   347  (95.59%)
          2 :    13  (3.58%)
          3 :     2  (0.55%)
         17 :     1  (0.28%)
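The occurrence distribution above counts exact duplicates among the weight vectors. A sketch of how such a table can be computed (the `occurrence_distribution` helper and the tuple representation are assumptions):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then count
    how many distinct vectors share each occurrence frequency."""
    vec_counts = Counter(map(tuple, weight_vectors))   # vector -> count
    freq_dist = Counter(vec_counts.values())           # count -> #vectors
    return vec_counts, freq_dist

vecs = [(0.1, 0.2), (0.1, 0.2), (0.5, 0.9), (1.0, 0.0)]
_, freq = occurrence_distribution(vecs)
print(sorted(freq.items()))   # [(1, 2), (2, 1)]: two singletons, one pair
```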

Identified 1 non-pure unique weight vector (from 363 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 179

Removed 1 non-pure weight vector
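A non-pure unique weight vector is one whose duplicate copies carry mixed true-match labels; only the minority-class copies are removed. A sketch of that filtering step (the `remove_minority_class` helper and its tie-breaking rule are assumptions, not the script's exact logic):

```python
from collections import defaultdict

def remove_minority_class(pairs):
    """pairs: list of (weight_vector_tuple, is_match). For each unique
    vector with mixed labels, keep only the majority-class copies."""
    by_vec = defaultdict(list)
    for vec, label in pairs:
        by_vec[vec].append(label)
    kept = []
    for vec, labels in by_vec.items():
        matches = sum(labels)
        majority = matches * 2 >= len(labels)   # ties keep the match label
        for label in labels:
            if len(set(labels)) == 1 or label == majority:
                kept.append((vec, label))
    return kept

# One vector occurring 17 times with pureness 16/17 (0.941, as in the log)
pairs = [((0.9, 0.8), True)] * 16 + [((0.9, 0.8), False)]
print(len(remove_minority_class(pairs)))   # 16: the minority copy is removed
```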

Final number of weight vectors to use: 395
  Number of unique weight vectors: 363

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (363, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 363 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 363 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 42 matches and 34 non-matches
    Purity of oracle classification:  0.553
    Entropy of oracle classification: 0.992
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  34
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 287 weight vectors
  Based on 42 matches and 34 non-matches
  Classified 146 matches and 141 non-matches
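The SVM step trains on the oracle-labelled sample and splits the remaining cluster into a predicted-match child and a predicted-non-match child. A sketch using scikit-learn's `SVC` (the original script's SVM library, kernel, and parameters may differ):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train a linear SVM on the oracle-labelled sample and split the
    remaining cluster into predicted matches and non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches
```

The two children are then pushed back onto the cluster queue, each inheriting the purity and estimated match proportion of the oracle-classified sample.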

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.5526315789473685, 0.9919924034538556, 0.5526315789473685)
    (141, 0.5526315789473685, 0.9919924034538556, 0.5526315789473685)

Current size of match and non-match training data sets: 42 / 34

Selected cluster with (queue ordering: random):
- Purity 0.55 and entropy 0.99
- Size 146 weight vectors
- Estimated match proportion 0.553

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 52 matches and 6 non-matches
    Purity of oracle classification:  0.897
    Entropy of oracle classification: 0.480
    Number of true matches:      52
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0
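The oracle is simulated by flipping each true label with probability 1 - accuracy, so at 100% accuracy every label passes through unchanged, as in the runs above. A minimal sketch (the `simulate_oracle` helper is hypothetical):

```python
import random

def simulate_oracle(true_labels, accuracy, seed=42):
    """Return oracle-classified labels: each true label is kept with
    probability `accuracy` and flipped otherwise."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]

labels = [True] * 52 + [False] * 6             # sample from the log above
print(simulate_oracle(labels, 1.0) == labels)  # True: a perfect oracle
```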

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)274_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984615
recall                 0.214047
f-measure              0.351648
da                           65
dm                            0
ndm                           0
tp                           64
fp                            1
tn                  4.76529e+07
fn                          235
Name: (10, 1 - acm diverg, 274), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)274_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 587
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 587 weight vectors
  Containing 191 true matches and 396 true non-matches
    (32.54% true matches)
  Identified 558 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   543  (97.31%)
          2 :    12  (2.15%)
          3 :     2  (0.36%)
         14 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 558 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 164
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 393

Removed 1 non-pure weight vector

Final number of weight vectors to use: 586
  Number of unique weight vectors: 558

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (558, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 558 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 558 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 31 matches and 51 non-matches
    Purity of oracle classification:  0.622
    Entropy of oracle classification: 0.957
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 476 weight vectors
  Based on 31 matches and 51 non-matches
  Classified 136 matches and 340 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)
    (340, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)

Current size of match and non-match training data sets: 31 / 51

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 340 weight vectors
- Estimated match proportion 0.378

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 340 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [0.790, 0.000, 0.636, 0.619, 0.429, 0.450, 0.609] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.778, 0.429, 0.571, 0.750, 0.600] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.481, 0.643, 0.667, 0.350, 0.643] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 3 matches and 68 non-matches
    Purity of oracle classification:  0.958
    Entropy of oracle classification: 0.253
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

65.0
Analysing file: diverg(10)346_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 346), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)346_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 730
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 730 weight vectors
  Containing 220 true matches and 510 true non-matches
    (30.14% true matches)
  Identified 694 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   675  (97.26%)
          2 :    16  (2.31%)
          3 :     2  (0.29%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 694 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 507

Removed 1 non-pure weight vector

Final number of weight vectors to use: 729
  Number of unique weight vectors: 694

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (694, 0.5, 1.0, 0.5)

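The size, purity, entropy, and estimated match proportion reported for each queued cluster can be reproduced from the oracle's match/non-match counts. A minimal sketch in Python — the formulas are inferred from the logged values (e.g. 33 matches / 51 non-matches in Loop 2 below yields purity 0.6071... and entropy 0.9666...), not taken from the source:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity, entropy and estimated match proportion of a labelled sample.

    Assumption (inferred from the logged values): purity is the
    majority-class fraction and entropy is the binary Shannon entropy
    of the match proportion.
    """
    total = num_matches + num_non_matches
    p = num_matches / total                # estimated match proportion
    purity = max(p, 1.0 - p)               # majority-class fraction
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p

# Loop 2 of this run: the oracle labels 33 matches and 51 non-matches
purity, entropy, prop = cluster_stats(33, 51)
```

Plugging in the Loop 2 counts reproduces the (size, purity, entropy, match proportion) tuples printed for both child clusters.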
Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 694 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 694 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

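The "far" initial selection above is a farthest-first traversal over the cluster's weight vectors: starting from a seed, it repeatedly adds the vector whose minimum distance to the already selected set is largest. A sketch under stated assumptions — the Euclidean metric and seeding with the first vector are guesses, not confirmed by the source:

```python
import math

def farthest_first(vectors, k, start=0):
    """Farthest-first selection of k vectors (a sketch of the "far"
    method; the seed choice and distance metric are assumptions)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        # pick the remaining vector farthest from the selected set
        far = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(far)
        remaining.remove(far)
    return selected

# 1-D toy example: the traversal spreads picks out across the data
picked = farthest_first([(0.0,), (10.0,), (5.0,), (1.0,)], k=3)
# -> [(0.0,), (10.0,), (5.0,)]
```
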
Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 33 matches and 51 non-matches
    Purity of oracle classification:  0.607
    Entropy of oracle classification: 0.967
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

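The oracle step simulates manual classification at a configurable accuracy (the `oracle_acc` usage parameter). A hedged sketch of such an oracle — the interface is an assumption; only the flip-with-probability behaviour is implied by the log:

```python
import random

def oracle_classify(true_labels, accuracy, rng=random):
    """Simulated human oracle: each true match/non-match label is
    returned correctly with probability `accuracy` and flipped
    otherwise (hypothetical interface, not the program's own)."""
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]

# at accuracy 1.0 every classification is correct, as in this run
labels = [True] * 33 + [False] * 51
answers = oracle_classify(labels, accuracy=1.0)
```
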
Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 610 weight vectors
  Based on 33 matches and 51 non-matches
  Classified 159 matches and 451 non-matches

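After the oracle-labelled sample is deleted, the remaining vectors are classified with an SVM trained on the accumulated training sets, and the cluster is split into predicted matches and predicted non-matches, both re-queued. The real program uses an SVM (scikit-learn's `svm.SVC` would be a natural implementation); as a dependency-free stand-in, a nearest-centroid split sketches the same control flow:

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(unlabelled, match_train, non_match_train):
    """Split unlabelled weight vectors into predicted matches and
    non-matches. Stand-in classifier: nearest centroid (the actual
    program trains an SVM on the same training sets)."""
    cm, cn = centroid(match_train), centroid(non_match_train)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches, non_matches = [], []
    for v in unlabelled:
        (matches if sqdist(v, cm) < sqdist(v, cn) else non_matches).append(v)
    return matches, non_matches

# 1-D toy example: one labelled match near 1.0, one non-match near 0.0
matches, non_matches = split_cluster([(0.9,), (0.1,)],
                                     match_train=[(1.0,)],
                                     non_match_train=[(0.0,)])
```

Each resulting sub-cluster inherits the parent sample's purity, entropy, and estimated match proportion, which is why both queue entries in Loop 2 show identical statistics.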
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6071428571428571, 0.9666186325481028, 0.39285714285714285)
    (451, 0.6071428571428571, 0.9666186325481028, 0.39285714285714285)

Current size of match and non-match training data sets: 33 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.97
- Size 451 weight vectors
- Estimated match proportion 0.393

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 451 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 3 matches and 73 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.240
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)533_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 533), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)533_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 351
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 351 weight vectors
  Containing 192 true matches and 159 true non-matches
    (54.70% true matches)
  Identified 319 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   304  (95.30%)
          2 :    12  (3.76%)
          3 :     2  (0.63%)
         17 :     1  (0.31%)

Identified 1 non-pure unique weight vector (from 319 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 162
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 156

Removed 1 non-pure weight vector

Final number of weight vectors to use: 350
  Number of unique weight vectors: 319

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (319, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 319 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 319 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 35 matches and 39 non-matches
    Purity of oracle classification:  0.527
    Entropy of oracle classification: 0.998
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  39
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 245 weight vectors
  Based on 35 matches and 39 non-matches
  Classified 149 matches and 96 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 74
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.527027027027027, 0.9978913098356863, 0.47297297297297297)
    (96, 0.527027027027027, 0.9978913098356863, 0.47297297297297297)

Current size of match and non-match training data sets: 35 / 39

Selected cluster (queue ordering: random) with:
- Purity 0.53 and entropy 1.00
- Size 149 weight vectors
- Estimated match proportion 0.473

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.564, 1.000, 0.200, 0.170, 0.192, 0.176, 0.244] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.367, 1.000, 0.154, 0.174, 0.125, 0.240, 0.226] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.242, 0.121, 0.200, 0.171, 0.000] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 42 matches and 16 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  16
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(10)870_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 870), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)870_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 873
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 873 weight vectors
  Containing 155 true matches and 718 true non-matches
    (17.75% true matches)
  Identified 837 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   809  (96.65%)
          2 :    25  (2.99%)
          3 :     2  (0.24%)
          8 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 837 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 139
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 697

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 865
  Number of unique weight vectors: 836

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (836, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 836 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 836 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00 accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 25 matches and 61 non-matches
    Purity of oracle classification:  0.709
    Entropy of oracle classification: 0.870
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
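
The purity, entropy, and estimated match proportion reported for each oracle round follow directly from the match/non-match counts. A minimal sketch of the statistics (binary Shannon entropy in bits; the match proportion is simply the match fraction of the sample):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction of the sample; entropy is
    the binary Shannon entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total  # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# The oracle round above classified 25 matches and 61 non-matches:
purity, entropy = purity_entropy(25, 61)
print(round(purity, 3), round(entropy, 3))  # purity 0.709, entropy 0.870
```

These are exactly the 0.709 / 0.870 figures printed above, and the match proportion 25/86 ≈ 0.291 reappears as the cluster's estimated match proportion in the next loop.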

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 750 weight vectors
  Based on 25 matches and 61 non-matches
  Classified 88 matches and 662 non-matches
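
The classification step trains on the oracle-labelled vectors (here 25 matches and 61 non-matches) and splits the remaining cluster by the predicted class, with each predicted class queued as a new cluster. The program uses an SVM (e.g. `sklearn.svm.SVC`); the sketch below substitutes a nearest-centroid classifier so it runs without third-party libraries — the centroid rule and the function name are illustrative assumptions, not the program's implementation:

```python
# Split a cluster using a classifier trained on oracle-labelled vectors.
# Nearest-centroid stands in for the SVM used by the actual program.
def split_cluster(train_matches, train_non_matches, remaining):
    def centroid(vectors):
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_match = centroid(train_matches)
    c_non_match = centroid(train_non_matches)
    pred_matches, pred_non_matches = [], []
    for vec in remaining:
        if sq_dist(vec, c_match) < sq_dist(vec, c_non_match):
            pred_matches.append(vec)      # classified as a match
        else:
            pred_non_matches.append(vec)  # classified as a non-match
    # Each predicted class becomes a new cluster on the queue.
    return pred_matches, pred_non_matches
```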

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (88, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)
    (662, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)

Current size of match and non-match training data sets: 25 / 61

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 662 weight vectors
- Estimated match proportion 0.291

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 662 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
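
The "farthest first" sampling above can be sketched as a greedy traversal: seed with one vector, then repeatedly add the vector whose distance to the nearest already-selected vector is largest. Euclidean distance and centroid-based seeding are assumptions here; the actual program may seed and measure distance differently:

```python
# Greedy farthest-first selection of k vectors (a sketch, assuming
# Euclidean distance and seeding with the vector farthest from the mean).
def farthest_first(vectors, k):
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    # Seed with the vector farthest from the centroid.
    selected = [max(vectors, key=lambda v: dist(v, mean))]
    while len(selected) < k:
        # Add the vector maximising its distance to the nearest
        # already-selected vector.
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

Because each pick maximises the minimum distance to the current sample, the selected vectors spread out over the cluster — useful when the oracle budget only covers a small sample.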

Perform oracle with 100.00 accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 13 matches and 58 non-matches
    Purity of oracle classification:  0.817
    Entropy of oracle classification: 0.687
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
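
"Perform oracle with 100.00 accuracy" means each queried true label is returned correctly with the given probability; at accuracy 1.0 no labels are flipped, which is why every false-match and false-non-match count above is zero. A sketch, assuming errors are independent per query:

```python
import random

# Imperfect oracle: each queried true label is answered correctly with
# probability `accuracy`, otherwise flipped (independence is an assumption).
def oracle_classify(true_labels, accuracy, rng=None):
    rng = rng if rng is not None else random.Random()
    return [label if rng.random() < accuracy else not label
            for label in true_labels]

# At accuracy 1.0 (as in this run) every answer matches the true label.
print(oracle_classify([True, False, True], 1.0))  # [True, False, True]
```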

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing the file: diverg(15)501_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                 0.976
recall                 0.408027
f-measure              0.575472
da                          125
dm                            0
ndm                           0
tp                          122
fp                            3
tn                  4.76529e+07
fn                          177
Name: (15, 1 - acm diverg, 501), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)501_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 406
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 406 weight vectors
  Containing 142 true matches and 264 true non-matches
    (34.98% true matches)
  Identified 390 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   379  (97.18%)
          2 :     8  (2.05%)
          3 :     2  (0.51%)
          5 :     1  (0.26%)

Identified 0 non-pure unique weight vectors (from 390 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 128
     0.000 : 262

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 406
  Number of unique weight vectors: 390

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (390, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 390 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 390 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00 accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 32 matches and 45 non-matches
    Purity of oracle classification:  0.584
    Entropy of oracle classification: 0.979
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 313 weight vectors
  Based on 32 matches and 45 non-matches
  Classified 89 matches and 224 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (89, 0.5844155844155844, 0.9793399259567799, 0.4155844155844156)
    (224, 0.5844155844155844, 0.9793399259567799, 0.4155844155844156)

Current size of match and non-match training data sets: 32 / 45

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 89 weight vectors
- Estimated match proportion 0.416

Sample size for this cluster: 46

Farthest first selection of 46 weight vectors from 89 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00 accuracy on 46 weight vectors
  The oracle will correctly classify 46 weight vectors and wrongly classify 0
  Classified 42 matches and 4 non-matches
    Purity of oracle classification:  0.913
    Entropy of oracle classification: 0.426
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 46 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

125.0
Analysing the file: diverg(20)910_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 910), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)910_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
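
The pureness filter groups identical weight vectors, computes the majority-class fraction for each unique vector, and drops the minority-class copies of any non-pure vector — here the single non-match among the 20 copies of one vector (pureness 19/20 = 0.950). A minimal sketch, assuming the input is a list of `(vector, is_match)` pairs:

```python
from collections import defaultdict

# Remove minority-class copies of non-pure unique weight vectors.
def pureness_filter(weight_vectors):
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[tuple(vec)].append(is_match)

    kept, removed = [], 0
    for vec, labels in groups.items():
        matches = sum(labels)
        majority_is_match = matches >= len(labels) - matches
        for is_match in labels:
            if is_match == majority_is_match:
                kept.append((list(vec), is_match))
            else:
                removed += 1  # minority-class copy of a non-pure vector
    return kept, removed
```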

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00 accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00 accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)214_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 214), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)214_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 407
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 407 weight vectors
  Containing 217 true matches and 190 true non-matches
    (53.32% true matches)
  Identified 370 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   352  (95.14%)
          2 :    15  (4.05%)
          3 :     2  (0.54%)
         19 :     1  (0.27%)
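
A frequency distribution like the one above can be computed with `collections.Counter` (the toy vectors below are made up for illustration):

```python
from collections import Counter

# Toy weight vectors, stored as tuples so they are hashable
weight_vectors = [(1.0, 0.8), (1.0, 0.8), (0.5, 0.2),
                  (0.5, 0.2), (0.5, 0.2), (0.9, 0.9)]

vec_count = Counter(weight_vectors)      # occurrences per unique vector
freq_dist = Counter(vec_count.values())  # occurrence : number of unique vectors

for occ in sorted(freq_dist):
    n = freq_dist[occ]
    print('%10d : %5d  (%.2f%%)' % (occ, n, 100.0 * n / len(vec_count)))
```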

Identified 1 non-pure unique weight vector (from 370 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 187

Removed 1 non-pure weight vector
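
The removal of minority-class copies of non-pure vectors can be sketched as follows (hypothetical data; the real script works on the loaded weight vector file):

```python
from collections import defaultdict

# Hypothetical unique vectors with the true-match labels of all their copies
copies = defaultdict(list)
for vec, label in [((1.0, 0.9), True), ((1.0, 0.9), True),
                   ((1.0, 0.9), False), ((0.2, 0.1), False)]:
    copies[vec].append(label)

kept = []
for vec, labels in copies.items():
    matches = labels.count(True)
    majority_is_match = matches >= len(labels) - matches
    # Keep only the majority-class copies; minority-class copies are dropped
    kept.extend((vec, majority_is_match)
                for _ in range(max(matches, len(labels) - matches)))

print(len(kept))  # 3 vectors kept, 1 minority-class copy removed
```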

Final number of weight vectors to use: 406
  Number of unique weight vectors: 370

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (370, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 370 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 370 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
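
Farthest-first selection, as used above, can be sketched as a greedy traversal (a minimal sketch assuming Euclidean distance; the selection in the log may differ in details such as the starting vector):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from an arbitrary vector, then
    repeatedly pick the vector whose distance to the closest already-selected
    vector is largest. Returns the indices of the k selected vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [0]  # arbitrary starting vector
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            d = dist(v, vectors[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
print(farthest_first(pts, 3))  # [0, 2, 3]
```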

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 29 matches and 47 non-matches
    Purity of oracle classification:  0.618
    Entropy of oracle classification: 0.959
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 294 weight vectors
  Based on 29 matches and 47 non-matches
  Classified 145 matches and 149 non-matches
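
The SVM split step above can be sketched with scikit-learn (assumed here as a stand-in; the original script may use a different SVM implementation, and the vectors below are synthetic):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic oracle-labelled sample: matches have high similarities, non-matches low
match_X = 0.7 + 0.3 * rng.random((10, 7))
non_match_X = 0.3 * rng.random((10, 7))
train_X = np.vstack([match_X, non_match_X])
train_y = np.array([1] * 10 + [0] * 10)  # 1 = match, 0 = non-match

# Remaining (unlabelled) weight vectors in the cluster
rest_X = rng.random((30, 7))

clf = SVC(kernel='linear')
clf.fit(train_X, train_y)
pred = clf.predict(rest_X)

# Split the cluster into a predicted-match and a predicted-non-match sub-cluster
match_cluster = rest_X[pred == 1]
non_match_cluster = rest_X[pred == 0]
print(len(match_cluster), len(non_match_cluster))
```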

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.618421052631579, 0.959149554396894, 0.3815789473684211)
    (149, 0.618421052631579, 0.959149554396894, 0.3815789473684211)

Current size of match and non-match training data sets: 29 / 47

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 149 weight vectors
- Estimated match proportion 0.382
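
The overall loop the log is tracing can be summarised by the following sketch (assumed control flow reconstructed from the log; sampling and splitting below are placeholders for the farthest-first and SVM steps):

```python
import random

def recursive_select(clusters, budget, sample_frac=0.2, min_purity=0.95):
    """Pop clusters at random, oracle-label a sample from each, and split
    impure clusters back onto the queue until the labelling budget is spent."""
    random.seed(0)
    used = 0
    queue = list(clusters)  # each cluster: list of (vector, true_label) pairs
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))
        # Placeholder for farthest-first sampling: take a prefix of the cluster
        sample = cluster[:max(1, int(sample_frac * len(cluster)))]
        used += len(sample)  # oracle classifications consumed
        matches = sum(1 for _, lbl in sample if lbl)
        purity = max(matches, len(sample) - matches) / float(len(sample))
        rest = cluster[len(sample):]
        if purity < min_purity and len(rest) > 1:
            mid = len(rest) // 2  # placeholder split; the log uses an SVM here
            queue.extend([rest[:mid], rest[mid:]])
    return used

# A pure 10-vector cluster: one sample of 2 oracle labels exhausts it
print(recursive_select([[((0.1,), False)] * 10], budget=5))  # 2
```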

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 9 matches and 48 non-matches
    Purity of oracle classification:  0.842
    Entropy of oracle classification: 0.629
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(15)798_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 798), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)798_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1050
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1050 weight vectors
  Containing 208 true matches and 842 true non-matches
    (19.81% true matches)
  Identified 1003 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   968  (96.51%)
          2 :    32  (3.19%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1003 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 821

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1049
  Number of unique weight vectors: 1003

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1003, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1003 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1003 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 916 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (793, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 123 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 46 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.149
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)705_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979592
recall                  0.32107
f-measure              0.483627
da                           98
dm                            0
ndm                           0
tp                           96
fp                            2
tn                  4.76529e+07
fn                          203
Name: (15, 1 - acm diverg, 705), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)705_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 678
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 678 weight vectors
  Containing 167 true matches and 511 true non-matches
    (24.63% true matches)
  Identified 659 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   646  (98.03%)
          2 :    10  (1.52%)
          3 :     2  (0.30%)
          6 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 659 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 150
     0.000 : 509

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 678
  Number of unique weight vectors: 659

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (659, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 659 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 659 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 32 matches and 52 non-matches
    Purity of oracle classification:  0.619
    Entropy of oracle classification: 0.959
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0
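The purity and entropy values reported throughout this log follow directly from the oracle's match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy (base 2) of the class distribution. A minimal sketch, with an illustrative function name not taken from the original script:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is the binary
    Shannon entropy (base 2) of the match/non-match distribution."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# Counts from the oracle step above: 32 matches, 52 non-matches
purity, entropy = purity_entropy(32, 52)
print(round(purity, 3), round(entropy, 3))  # 0.619 0.959
```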

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 575 weight vectors
  Based on 32 matches and 52 non-matches
  Classified 114 matches and 461 non-matches
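The step above trains an SVM on the oracle-labelled vectors and uses its predictions to split the remainder of the cluster into two child clusters. The dependency-free sketch below shows the same train-then-split pattern with a nearest-centroid rule standing in for the SVM (the actual program uses an SVM classifier; all names and numbers here are illustrative):

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(match_train, non_match_train, remainder):
    """Assign each unlabelled vector to the closer class centroid,
    yielding the two child clusters of the split."""
    cm, cn = centroid(match_train), centroid(non_match_train)
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    matches, non_matches = [], []
    for v in remainder:
        (matches if dist(v, cm) <= dist(v, cn) else non_matches).append(v)
    return matches, non_matches

m, n = split_cluster([[0.9, 0.9], [1.0, 0.8]],   # labelled matches
                     [[0.1, 0.2], [0.2, 0.1]],   # labelled non-matches
                     [[0.95, 0.85], [0.15, 0.1], [0.8, 0.9]])
print(len(m), len(n))  # 2 1
```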

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (114, 0.6190476190476191, 0.9587118829771318, 0.38095238095238093)
    (461, 0.6190476190476191, 0.9587118829771318, 0.38095238095238093)

Current size of match and non-match training data sets: 32 / 52

Selected cluster (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 114 weight vectors
- Estimated match proportion 0.381

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 114 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
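Farthest-first selection, used for the sampling steps in this log, greedily picks weight vectors that are maximally spread out: after an initial vector, each step adds the candidate whose minimum distance to the already-selected set is largest. A minimal sketch (the seed choice and distance metric of the original program may differ):

```python
def farthest_first(vectors, k):
    """Select k vectors, each maximising the distance
    to its nearest already-selected neighbour."""
    selected = [vectors[0]]  # arbitrary starting vector
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.1), (0.9, 0.2), (0.5, 0.5)]
print(farthest_first(vecs, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.9, 0.2)]
```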

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and misclassify 0
  Classified 45 matches and 6 non-matches
    Purity of oracle classification:  0.882
    Entropy of oracle classification: 0.523
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

98.0
Analysing the file: diverg(15)741_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 741), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)741_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 781
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 781 weight vectors
  Containing 222 true matches and 559 true non-matches
    (28.43% true matches)
  Identified 727 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   690  (94.91%)
          2 :    34  (4.68%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 727 unique weight vectors)
Pureness (as the fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 538

Removed 1 non-pure weight vector
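The pureness analysis above groups identical weight vectors, computes the fraction of true matches in each group, and removes the minority-class copies of any group that is neither fully matching (pureness 1.0) nor fully non-matching (0.0). A small sketch of that grouping step, assuming (vector, is_match) pairs as the representation (names are illustrative):

```python
from collections import defaultdict

def remove_minority(weight_vectors):
    """weight_vectors: list of (vector_tuple, is_match) pairs.
    Drop minority-class copies of non-pure duplicated vectors."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        majority = pureness >= 0.5  # majority class of this group
        for is_match in labels:
            if pureness in (0.0, 1.0) or is_match == majority:
                kept.append((vec, is_match))
    return kept

# 17 copies of one vector, 16 matches and 1 non-match: pureness 16/17 = 0.941,
# so the single non-match copy is removed (mirroring the log output above).
data = [((0.9, 0.9), True)] * 16 + [((0.9, 0.9), False)]
print(len(remove_minority(data)))  # 16
```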

Final number of weight vectors to use: 780
  Number of unique weight vectors: 727

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (727, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 727 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 727 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and misclassify 0
  Classified 32 matches and 53 non-matches
    Purity of oracle classification:  0.624
    Entropy of oracle classification: 0.956
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 642 weight vectors
  Based on 32 matches and 53 non-matches
  Classified 301 matches and 341 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (301, 0.6235294117647059, 0.9555111232924128, 0.3764705882352941)
    (341, 0.6235294117647059, 0.9555111232924128, 0.3764705882352941)

Current size of match and non-match training data sets: 32 / 53

Selected cluster (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 301 weight vectors
- Estimated match proportion 0.376

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 301 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and misclassify 0
  Classified 44 matches and 25 non-matches
    Purity of oracle classification:  0.638
    Entropy of oracle classification: 0.945
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  25
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)134_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 134), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)134_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 934
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 934 weight vectors
  Containing 217 true matches and 717 true non-matches
    (23.23% true matches)
  Identified 879 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   843  (95.90%)
          2 :    33  (3.75%)
          3 :     2  (0.23%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 879 unique weight vectors)
Pureness (as the fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 696

Removed 1 non-pure weight vector

Final number of weight vectors to use: 933
  Number of unique weight vectors: 879

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (879, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 879 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 879 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and misclassify 0
  Classified 25 matches and 61 non-matches
    Purity of oracle classification:  0.709
    Entropy of oracle classification: 0.870
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 793 weight vectors
  Based on 25 matches and 61 non-matches
  Classified 133 matches and 660 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)
    (660, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)

Current size of match and non-match training data sets: 25 / 61

Selected cluster (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 133 weight vectors
- Estimated match proportion 0.291

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 133 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and misclassify 0
  Classified 49 matches and 1 non-matches
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.141
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(15)8_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 8), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)8_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 187 true matches and 865 true non-matches
    (17.78% true matches)
  Identified 1010 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   979  (96.93%)
          2 :    28  (2.77%)
          3 :     2  (0.20%)
         11 :     1  (0.10%)

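The unique-vector count and the occurrence frequency distribution above can be computed with two nested `Counter`s; the toy 2-D vectors below are hypothetical stand-ins for the real 7-dimensional weight vectors:

```python
from collections import Counter

# Hypothetical weight vectors (tuples so they are hashable).
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.9), (1.0, 0.5), (0.4, 0.4)]

occurrences = Counter(vectors)             # unique vector -> how often it occurs
freq_dist = Counter(occurrences.values())  # occurrence count -> number of vectors

print(len(occurrences))           # number of unique weight vectors -> 3
print(sorted(freq_dist.items()))  # -> [(1, 2), (3, 1)]
```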
Identified 1 non-pure unique weight vector (from 1010 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 165
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 844

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 1010

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1010, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1010 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1010 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

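The "farthest first" listings above come from a greedy max-min traversal: each step adds the vector whose nearest already-selected neighbour is farthest away, which is why the samples mix extreme corners of the weight space. A minimal sketch, assuming Euclidean distance and the first vector as seed (both assumptions about the actual script):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection (k-center style heuristic): start
    from the first vector, then repeatedly pick the remaining vector whose
    minimum distance to the selected set is largest."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected

sample = farthest_first([(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.5, 0.5)], 3)
print(sample)  # -> [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```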
Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 923 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 86 matches and 837 non-matches

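The split step trains a classifier on the oracle-labelled vectors and partitions the remaining cluster by its predictions. A sketch using scikit-learn's `svm.SVC` (the actual SVM library and kernel used by the script are assumptions), with tiny hypothetical 2-D vectors in place of the real ones:

```python
from sklearn import svm

# Oracle-labelled training data (hypothetical): 1 = match, 0 = non-match.
train_vecs = [[0.9, 0.8], [0.95, 0.9], [0.85, 0.95],
              [0.1, 0.2], [0.2, 0.1], [0.15, 0.3]]
train_labels = [1, 1, 1, 0, 0, 0]

clf = svm.SVC(kernel="linear")
clf.fit(train_vecs, train_labels)

# Split the remaining (unlabelled) cluster by predicted class.
remaining = [[0.8, 0.9], [0.15, 0.05], [0.3, 0.2]]
pred = clf.predict(remaining)
match_cluster = [v for v, p in zip(remaining, pred) if p == 1]
non_match_cluster = [v for v, p in zip(remaining, pred) if p == 0]
```

The two sub-clusters are then pushed back onto the queue, which is why the queue length grows from 1 to 2 between the loops above.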
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (86, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (837, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 86 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 41

Farthest first selection of 41 weight vectors from 86 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 41 weight vectors
  The oracle will correctly classify 41 weight vectors and wrongly classify 0
  Classified 41 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 41 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analyzing file: diverg(20)890_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 890), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)890_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1073
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1073 weight vectors
  Containing 226 true matches and 847 true non-matches
    (21.06% true matches)
  Identified 1016 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   979  (96.36%)
          2 :    34  (3.35%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1016 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 826

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1072
  Number of unique weight vectors: 1016

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1016, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1016 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1016 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 929 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 332 matches and 597 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (332, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (597, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 597 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 597 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.692, 0.583, 0.500, 0.750, 0.731] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)841_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 841), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)841_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 295
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 295 weight vectors
  Containing 186 true matches and 109 true non-matches
    (63.05% true matches)
  Identified 272 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   260  (95.59%)
          2 :     9  (3.31%)
          3 :     2  (0.74%)
         11 :     1  (0.37%)

Identified 1 non-pure unique weight vector (from 272 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 163
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 108

Removed 1 non-pure weight vector

Final number of weight vectors to use: 294
  Number of unique weight vectors: 272

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (272, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 272 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 71

Perform initial selection using "far" method

Farthest first selection of 71 weight vectors from 272 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 34 matches and 37 non-matches
    Purity of oracle classification:  0.521
    Entropy of oracle classification: 0.999
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0
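
The purity and entropy figures reported by the oracle steps in this log can be reproduced with a short sketch (purity taken here as the majority-class fraction and entropy as the binary Shannon entropy of the match proportion; the helper names `purity` and `entropy` are illustrative, not the program's own functions):

```python
import math

def purity(n_match, n_nonmatch):
    # Fraction of the majority class in an oracle-labelled sample.
    return max(n_match, n_nonmatch) / (n_match + n_nonmatch)

def entropy(n_match, n_nonmatch):
    # Binary Shannon entropy (in bits) of the match / non-match split.
    p = n_match / (n_match + n_nonmatch)
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# The 34 match / 37 non-match oracle split reported above:
print(purity(34, 37))   # ~0.521
print(entropy(34, 37))  # ~0.999
```

These are exactly the values carried on the cluster queue entries below (e.g. 0.5211… and 0.9987…).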

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 201 weight vectors
  Based on 34 matches and 37 non-matches
  Classified 140 matches and 61 non-matches
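
The "SVM classification" step above trains a classifier on the oracle-labelled sample and splits the remaining cluster by predicted class. A minimal sketch of that split, assuming scikit-learn's `SVC` (the original program's classifier setup is not shown in this log, so kernel and parameters are assumptions):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    # Train on the oracle-labelled weight vectors (1 = match, 0 = non-match),
    # then partition the unlabelled remainder of the cluster by prediction.
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, preds) if p == 0]
    return matches, non_matches
```

The two resulting lists correspond to the two new clusters pushed onto the queue for the next loop.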

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 71
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.5211267605633803, 0.9987117514654895, 0.4788732394366197)
    (61, 0.5211267605633803, 0.9987117514654895, 0.4788732394366197)

Current size of match and non-match training data sets: 34 / 37

Selected cluster (queue ordering: random) with:
- Purity 0.52 and entropy 1.00
- Size 140 weight vectors
- Estimated match proportion 0.479

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 140 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
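
The farthest-first selections in this log pick, at each step, the weight vector whose minimum distance to the already-selected set is largest. A greedy sketch (the seed choice and the Euclidean metric here are assumptions; the actual program may differ):

```python
import math

def farthest_first(vectors, k):
    # Greedy max-min traversal: start from an arbitrary seed, then
    # repeatedly add the vector farthest from everything selected so far.
    selected = [vectors[0]]
    while len(selected) < k:
        best, best_dist = None, -1.0
        for v in vectors:
            if v in selected:
                continue
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected
```

Vectors are passed as tuples so membership tests work; each round costs O(n * |selected|), which is fine at the cluster sizes seen here.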

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 49 matches and 8 non-matches
    Purity of oracle classification:  0.860
    Entropy of oracle classification: 0.585
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing file: diverg(20)774_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 774), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)774_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
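
The non-pure removal above keeps, for each unique weight vector that occurs with both labels, only its majority-class copies. A hypothetical sketch of that filter (`remove_non_pure` is an illustrative name, not the program's own function):

```python
from collections import Counter

def remove_non_pure(weight_vectors, labels):
    # Pureness of a unique weight vector = fraction of its occurrences
    # that are true matches; drop minority-class copies of mixed vectors.
    match_counts, totals = Counter(), Counter()
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)
        totals[key] += 1
        if is_match:
            match_counts[key] += 1
    kept_vecs, kept_labels = [], []
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)
        pureness = match_counts[key] / totals[key]
        majority_is_match = pureness >= 0.5
        if 0.0 < pureness < 1.0 and is_match != majority_is_match:
            continue  # minority-class copy of a non-pure vector
        kept_vecs.append(vec)
        kept_labels.append(is_match)
    return kept_vecs, kept_labels
```

This reproduces the bookkeeping above: one vector with pureness 0.950 loses its single minority-class copy, taking 1101 vectors down to 1100.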

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)530_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 530), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)530_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1092
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1092 weight vectors
  Containing 226 true matches and 866 true non-matches
    (20.70% true matches)
  Identified 1035 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   998  (96.43%)
          2 :    34  (3.29%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1035 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 845

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1091
  Number of unique weight vectors: 1035

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1035, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1035 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1035 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 947 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 816 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (816, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 131 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)665_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 665), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)665_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 845
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 845 weight vectors
  Containing 227 true matches and 618 true non-matches
    (26.86% true matches)
  Identified 788 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   751  (95.30%)
          2 :    34  (4.31%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)
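The occurrence histogram above counts how often each distinct weight vector appears, then tabulates how many vectors share each occurrence count. A sketch with `collections.Counter` (the small example vectors are hypothetical stand-ins for the real data):

```python
from collections import Counter

# Each weight vector is a tuple of similarity weights; the duplicates
# below are hypothetical stand-ins for the real weight vector file.
weight_vectors = [
    (1.0, 0.9), (1.0, 0.9), (0.5, 0.4),
    (0.2, 0.1), (0.2, 0.1), (0.2, 0.1),
]

vec_counts = Counter(weight_vectors)      # vector -> how often it occurs
freq_dist = Counter(vec_counts.values())  # occurrence count -> num vectors

num_unique = len(vec_counts)
for occ in sorted(freq_dist):
    n = freq_dist[occ]
    print('%10d : %5d  (%.2f%%)' % (occ, n, 100.0 * n / num_unique))
```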

Identified 1 non-pure unique weight vector (from 788 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 597

Removed 1 non-pure weight vector

Final number of weight vectors to use: 844
  Number of unique weight vectors: 788

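The pureness filter above computes, for each distinct weight vector, the fraction of its copies that are true matches, then drops the minority-class copies of any vector that is not fully pure (here, one non-match copy of a vector that occurred 20 times with pureness 0.950). A sketch of that pre-processing step (function name and the tie rule at pureness 0.5 are my assumptions):

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """For each distinct vector, compute its pureness (fraction of
    copies labelled as matches) and drop the minority-class copies of
    any vector that is not fully pure. The behaviour at exactly 0.5
    pureness is an assumption (ties kept as matches)."""
    by_vec = defaultdict(list)
    for vec, is_match in labelled_vectors:
        by_vec[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        pureness = sum(labels) / len(labels)
        keep_match = pureness >= 0.5  # majority class of this vector
        for is_match in labels:
            if pureness in (0.0, 1.0) or is_match == keep_match:
                kept.append((vec, is_match))
    return kept
```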
Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (788, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 788 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 788 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

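The "far" initial selection above is a greedy farthest-first traversal: at each step it adds the vector whose minimum distance to the already-selected vectors is largest, so the sample spreads across the cluster. A sketch (the seed choice and Euclidean metric are assumptions; the real script may differ):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector,
    then repeatedly add the remaining vector farthest from the current
    selection. Seeding with vectors[0] is an assumption."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the vector maximising its distance to the nearest
        # already-selected vector.
        best = max(remaining,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```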
Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 703 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 162 matches and 541 non-matches

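The split step above trains an SVM on the oracle-labelled sample and uses its predictions to partition the unlabelled remainder of the cluster into candidate match and non-match sub-clusters, which are then pushed back onto the queue. A sketch using scikit-learn's `SVC` (an assumption — the original script's SVM backend may differ):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-classified sample (labels 1 = match,
    0 = non-match), then split the remaining cluster by predicted
    class. scikit-learn's SVC is an assumed stand-in."""
    clf = SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    match_cluster = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_match_cluster = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return match_cluster, non_match_cluster
```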
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (162, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (541, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 162 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 162 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)423_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 423), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)423_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 179 matches and 760 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (760, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 179 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 179 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 43 matches and 15 non-matches
    Purity of oracle classification:  0.741
    Entropy of oracle classification: 0.825
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  15
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)411_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984615
recall                 0.214047
f-measure              0.351648
da                           65
dm                            0
ndm                           0
tp                           64
fp                            1
tn                  4.76529e+07
fn                          235
Name: (10, 1 - acm diverg, 411), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)411_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 569
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 569 weight vectors
  Containing 186 true matches and 383 true non-matches
    (32.69% true matches)
  Identified 540 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   525  (97.22%)
          2 :    12  (2.22%)
          3 :     2  (0.37%)
         14 :     1  (0.19%)

Identified 1 non-pure unique weight vector (from 540 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 159
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 380

Removed 1 non-pure weight vector

Final number of weight vectors to use: 568
  Number of unique weight vectors: 540

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (540, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 540 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 540 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 30 matches and 51 non-matches
    Purity of oracle classification:  0.630
    Entropy of oracle classification: 0.951
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster
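The purity, entropy, and estimated match proportion reported for each oracle sample follow directly from the match and non-match counts. A minimal sketch (the function name is illustrative, not from the program):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity, binary Shannon entropy, and match proportion of an
    oracle-labelled sample.  Purity is the majority-class fraction;
    entropy is 0 for a pure sample and 1 for a 50/50 split."""
    total = num_matches + num_non_matches
    p = num_matches / total            # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy, p

# The Loop 1 sample above: 30 matches, 51 non-matches.
purity, entropy, prop = cluster_stats(30, 51)
```

With 30 matches out of 81, this reproduces the logged purity 0.630, entropy 0.951, and match proportion 0.370.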

Cluster not pure enough or too large, and can be split further

SVM classification of 459 weight vectors
  Based on 30 matches and 51 non-matches
  Classified 137 matches and 322 non-matches
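The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster with it. A hedged scikit-learn sketch on synthetic data (the linear kernel, feature ranges, and all variable names are assumptions; the program's actual SVM settings may differ):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Synthetic stand-ins for oracle-labelled similarity vectors:
# matches cluster near 1.0, non-matches near low similarities.
match_train = rng.uniform(0.7, 1.0, size=(30, 7))
non_match_train = rng.uniform(0.0, 0.5, size=(51, 7))
X_train = np.vstack([match_train, non_match_train])
y_train = np.array([1] * 30 + [0] * 51)

# Remaining unlabelled weight vectors in the cluster.
X_rest = rng.uniform(0.0, 1.0, size=(459, 7))

clf = SVC(kernel="linear")         # kernel choice is an assumption
clf.fit(X_train, y_train)
pred = clf.predict(X_rest)

# The cluster splits into a predicted-match and a predicted-non-match part,
# which are then pushed back onto the queue.
match_part = X_rest[pred == 1]
non_match_part = X_rest[pred == 0]
```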

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (137, 0.6296296296296297, 0.9509560484549725, 0.37037037037037035)
    (322, 0.6296296296296297, 0.9509560484549725, 0.37037037037037035)

Current size of match and non-match training data sets: 30 / 51

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 322 weight vectors
- Estimated match proportion 0.370

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 322 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [0.790, 0.000, 0.636, 0.619, 0.429, 0.450, 0.609] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.778, 0.429, 0.571, 0.750, 0.600] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.481, 0.643, 0.667, 0.350, 0.643] (False)
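Farthest-first selection, as used above, greedily picks vectors that are maximally spread out: after a start vector, each step adds the vector whose minimum Euclidean distance to the already-selected set is largest. A minimal sketch (function name illustrative):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: select k vectors that are
    mutually far apart in Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    # Minimum distance from every vector to the current selected set.
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            d = dist(v, vectors[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected

# Tiny example: the two outliers are picked right after the start vector.
sel = farthest_first([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
                      (10.0, 10.0), (10.0, 0.0)], 3)
```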

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 0 matches and 70 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

65.0
Analysing the file: diverg(15)296_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.99
recall                 0.331104
f-measure              0.496241
da                          100
dm                            0
ndm                           0
tp                           99
fp                            1
tn                  4.76529e+07
fn                          200
Name: (15, 1 - acm diverg, 296), dtype: object
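The precision, recall, and f-measure values in the series above follow directly from the tp/fp/fn counts; a quick sketch for checking them (function name illustrative; true negatives do not enter any of the three):

```python
def pr_f1(tp, fp, fn):
    """Precision, recall, and F-measure from raw match counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# tp=99, fp=1, fn=200 as in the series above.
p, r, f = pr_f1(99, 1, 200)
```

This reproduces the logged precision 0.99, recall 0.331104, and f-measure 0.496241.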

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)296_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1006
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1006 weight vectors
  Containing 166 true matches and 840 true non-matches
    (16.50% true matches)
  Identified 967 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   938  (97.00%)
          2 :    26  (2.69%)
          3 :     2  (0.21%)
         10 :     1  (0.10%)
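The occurrence distribution above can be computed by counting duplicate weight vectors, for example:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """How often each distinct weight vector occurs, summarised as
    occurrence-count -> number of distinct vectors with that count."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

# Small example: one vector occurring 10 times, one twice, one once.
dist = occurrence_distribution([(1, 0)] * 10 + [(0, 1), (0, 1), (0, 2)])
```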

Identified 1 non-pure unique weight vector (from 967 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 147
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 819

Removed 1 non-pure weight vector
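Removing non-pure vectors amounts to grouping identical weight vectors by their true match status and discarding the minority-class copies; a hedged sketch (function name and tie-breaking toward the match class are assumptions):

```python
from collections import defaultdict

def remove_minority_copies(pairs):
    """pairs: list of (weight_vector_tuple, is_match).  For every distinct
    vector generated by both matches and non-matches, drop the copies
    belonging to the minority class."""
    groups = defaultdict(lambda: [0, 0])   # vector -> [match, non-match]
    for vec, is_match in pairs:
        groups[vec][0 if is_match else 1] += 1
    kept = []
    for vec, is_match in pairs:
        m, n = groups[vec]
        if m and n:                        # non-pure vector
            majority_is_match = m >= n     # tie goes to matches (assumption)
            if is_match != majority_is_match:
                continue                   # drop the minority copy
        kept.append((vec, is_match))
    return kept

# A vector with pureness 0.9 (9 matches, 1 non-match): the single
# non-match copy is removed; a pure non-match vector is kept.
kept = remove_minority_copies([((0.5,), True)] * 9
                              + [((0.5,), False), ((0.1,), False)])
```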

Final number of weight vectors to use: 1005
  Number of unique weight vectors: 967

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (967, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 967 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 967 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 880 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 83 matches and 797 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (83, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (797, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 83 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 40

Farthest first selection of 40 weight vectors from 83 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 40 weight vectors
  The oracle will correctly classify 40 weight vectors and wrongly classify 0
  Classified 39 matches and 1 non-match
    Purity of oracle classification:  0.975
    Entropy of oracle classification: 0.169
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 40 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing the file: diverg(10)958_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 958), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)958_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 695
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 695 weight vectors
  Containing 202 true matches and 493 true non-matches
    (29.06% true matches)
  Identified 669 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   655  (97.91%)
          2 :    11  (1.64%)
          3 :     2  (0.30%)
         12 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 669 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 176
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 492

Removed 1 non-pure weight vector

Final number of weight vectors to use: 694
  Number of unique weight vectors: 669

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (669, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 669 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 669 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 32 matches and 52 non-matches
    Purity of oracle classification:  0.619
    Entropy of oracle classification: 0.959
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 585 weight vectors
  Based on 32 matches and 52 non-matches
  Classified 142 matches and 443 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6190476190476191, 0.9587118829771318, 0.38095238095238093)
    (443, 0.6190476190476191, 0.9587118829771318, 0.38095238095238093)

Current size of match and non-match training data sets: 32 / 52

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 443 weight vectors
- Estimated match proportion 0.381

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 443 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.565, 0.667, 0.600, 0.412, 0.381] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.750, 0.667, 0.444, 0.765, 0.714] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.818, 0.762, 0.714, 0.500, 0.400] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.455, 0.714, 0.429, 0.550, 0.647] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 4 matches and 71 non-matches
    Purity of oracle classification:  0.947
    Entropy of oracle classification: 0.300
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0
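
The purity and entropy values reported for each oracle step follow the standard binary-cluster definitions: purity is the majority-class fraction, entropy the binary Shannon entropy of the match proportion. A minimal sketch (the function name is illustrative, not from the script):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction of the cluster;
    entropy = binary Shannon entropy of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0*log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy
```

For the 4-match / 71-non-match split above this gives purity 71/75 ≈ 0.947 and entropy ≈ 0.300, matching the reported figures.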

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(15)795_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 795), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)795_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1074
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1074 weight vectors
  Containing 208 true matches and 866 true non-matches
    (19.37% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   992  (96.59%)
          2 :    32  (3.12%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)
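
The frequency distribution above amounts to two nested counts: how often each distinct weight vector occurs, then how many vectors share each occurrence count. A minimal sketch (function name illustrative, not from the script):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that occur exactly that many times."""
    vec_counts = Counter(map(tuple, weight_vectors))  # vector -> occurrences
    return Counter(vec_counts.values())               # occurrences -> #vectors
```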

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 845

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1073
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
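
The "farthest first" listings above come from a greedy max-min selection: each pick maximises its minimum distance to the vectors already chosen. A minimal sketch, assuming Euclidean distance and a centroid-based first pick (the script may seed the traversal differently):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising its minimum Euclidean
    distance to those already selected. Assumed variant: the first pick
    is the vector farthest from the data-set centroid."""
    dim = len(vectors[0])
    centroid = tuple(sum(v[i] for v in vectors) / len(vectors)
                     for i in range(dim))

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [max(vectors, key=lambda v: dist(v, centroid))]
    while len(selected) < k:
        selected.append(max((v for v in vectors if v not in selected),
                            key=lambda v: min(dist(v, s) for s in selected)))
    return selected
```

The greedy rule spreads the sample across the weight-vector space, which is why the listings mix high-similarity (likely match) and low-similarity (likely non-match) vectors.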

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 121 matches and 818 non-matches
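
The splitting step trains a classifier on the oracle-labelled samples and partitions the cluster's remaining vectors by predicted class. A self-contained sketch of this train-then-split pattern, using a simple perceptron as a stand-in for the script's SVM (all names illustrative):

```python
def split_cluster(train_vecs, train_labels, remaining_vecs,
                  epochs=100, lr=0.1):
    """Train a linear classifier (perceptron stand-in for the SVM used
    by the script) on the oracle-labelled vectors, then partition the
    remaining vectors into predicted matches (1) and non-matches (0)."""
    dim = len(train_vecs[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(train_vecs, train_labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            if err:  # standard perceptron update on a mistake
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err

    def predict(x):
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

    matches = [v for v in remaining_vecs if predict(v) == 1]
    non_matches = [v for v in remaining_vecs if predict(v) == 0]
    return matches, non_matches
```

Each predicted class then becomes a new cluster in the queue, as the "Loop 2: Queue length: 2" output below shows.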

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (121, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (818, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 121 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 121 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 46 matches and 2 non-matches
    Purity of oracle classification:  0.958
    Entropy of oracle classification: 0.250
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)519_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 519), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)519_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 680
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 680 weight vectors
  Containing 198 true matches and 482 true non-matches
    (29.12% true matches)
  Identified 635 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   601  (94.65%)
          2 :    31  (4.88%)
          3 :     2  (0.31%)
         11 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 635 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 461

Removed 1 non-pure weight vector

Final number of weight vectors to use: 679
  Number of unique weight vectors: 635

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (635, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 635 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 635 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 30 matches and 53 non-matches
    Purity of oracle classification:  0.639
    Entropy of oracle classification: 0.944
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 552 weight vectors
  Based on 30 matches and 53 non-matches
  Classified 194 matches and 358 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (194, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)
    (358, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)

Current size of match and non-match training data sets: 30 / 53

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 358 weight vectors
- Estimated match proportion 0.361

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 358 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.700, 0.429, 0.476, 0.647, 0.810] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.857, 0.875, 0.625, 0.333, 0.667] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(20)97_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 97), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)97_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

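The farthest-first traversal above can be sketched as follows. This is a minimal illustration of the idea, not the authors' implementation: greedily pick the weight vector whose minimum Euclidean distance to the already-selected set is largest.

```python
import math

def euclidean(u, v):
    # Euclidean distance between two weight vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal."""
    selected_idx = [0]  # seed with an arbitrary vector
    while len(selected_idx) < k:
        candidates = [i for i in range(len(vectors)) if i not in selected_idx]
        # A candidate's distance to the selected set is its minimum
        # distance to any already-selected vector; pick the maximum.
        best = max(candidates,
                   key=lambda i: min(euclidean(vectors[i], vectors[j])
                                     for j in selected_idx))
        selected_idx.append(best)
    return [vectors[i] for i in selected_idx]
```

With the unit square's corners plus its centre, the traversal picks opposite corners before the centre, which is why the vectors listed above spread across the extremes of the similarity space.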
Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

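The purity and entropy figures reported for the oracle classification can be reproduced from the match/non-match counts. A sketch for the binary case, assuming purity is the majority-class fraction and entropy the binary Shannon entropy:

```python
import math

def purity(num_match, num_non_match):
    # Fraction of the majority class in the classified sample
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    # Binary Shannon entropy of the class distribution, in bits
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        p = count / total
        if p > 0.0:
            h -= p * math.log2(p)
    return h

# The 87 vectors classified above: 23 matches and 64 non-matches
print(round(purity(23, 64), 3))   # 0.736
print(round(entropy(23, 64), 3))  # 0.833
```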
Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

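The splitting step trains a classifier on the oracle-labelled vectors and uses it to divide the remaining, unlabelled cluster into a predicted-match and a predicted-non-match sub-cluster. A minimal sketch with scikit-learn's `SVC` (an assumption for illustration; the original script may use a different SVM binding):

```python
# Sketch of the SVM cluster-splitting step, assuming scikit-learn.
from sklearn.svm import SVC

def split_cluster(labelled_vecs, labels, remaining_vecs):
    """Train an SVM on oracle-labelled vectors and split the rest.

    labels: 1 = match, 0 = non-match.
    Returns (match_cluster, non_match_cluster).
    """
    clf = SVC(kernel="linear")
    clf.fit(labelled_vecs, labels)
    pred = clf.predict(remaining_vecs)
    match_cluster = [v for v, p in zip(remaining_vecs, pred) if p == 1]
    non_match_cluster = [v for v, p in zip(remaining_vecs, pred) if p == 0]
    return match_cluster, non_match_cluster
```

Both sub-clusters are then pushed back onto the queue with the parent's purity, entropy, and estimated match proportion as initial statistics, as the Loop 2 output below shows.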
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)647_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 647), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)647_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 283
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 283 weight vectors
  Containing 195 true matches and 88 true non-matches
    (68.90% true matches)
  Identified 254 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   242  (95.28%)
          2 :     9  (3.54%)
          3 :     2  (0.79%)
         17 :     1  (0.39%)

Identified 1 non-pure unique weight vector (from 254 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 166
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 87

Removed 1 non-pure weight vector

Final number of weight vectors to use: 282
  Number of unique weight vectors: 254

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (254, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 254 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 70

Perform initial selection using "far" method

Farthest first selection of 70 weight vectors from 254 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 38 matches and 32 non-matches
    Purity of oracle classification:  0.543
    Entropy of oracle classification: 0.995
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  32
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 184 weight vectors
  Based on 38 matches and 32 non-matches
  Classified 136 matches and 48 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 70
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.5428571428571428, 0.9946937953613058, 0.5428571428571428)
    (48, 0.5428571428571428, 0.9946937953613058, 0.5428571428571428)

Current size of match and non-match training data sets: 38 / 32

Selected cluster with (queue ordering: random):
- Purity 0.54 and entropy 0.99
- Size 136 weight vectors
- Estimated match proportion 0.543

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.971, 0.952, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 46 matches and 10 non-matches
    Purity of oracle classification:  0.821
    Entropy of oracle classification: 0.677
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  10
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(15)144_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 144), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)144_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 526
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 526 weight vectors
  Containing 208 true matches and 318 true non-matches
    (39.54% true matches)
  Identified 497 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   480  (96.58%)
          2 :    14  (2.82%)
          3 :     2  (0.40%)
         12 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 497 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 315

Removed 1 non-pure weight vector

Final number of weight vectors to use: 525
  Number of unique weight vectors: 497

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (497, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 497 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 497 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 35 matches and 45 non-matches
    Purity of oracle classification:  0.562
    Entropy of oracle classification: 0.989
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 417 weight vectors
  Based on 35 matches and 45 non-matches
  Classified 142 matches and 275 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.5625, 0.9886994082884974, 0.4375)
    (275, 0.5625, 0.9886994082884974, 0.4375)

Current size of match and non-match training data sets: 35 / 45

Selected cluster with (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 275 weight vectors
- Estimated match proportion 0.438

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 275 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
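
The farthest-first selections above can be sketched as a greedy traversal: start from an arbitrary vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. This sketch assumes Euclidean distance; the distance measure actually used by the program is not visible in this output.

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: select k vectors such that each new
    pick maximises the minimum distance to the vectors picked so far."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    selected = [vectors[0]]          # start from an arbitrary vector
    # minimum distance of every candidate to the selected set so far
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[idx]))
    return selected

sample = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5], [0.0, 1.0]], 2)
```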

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 5 matches and 65 non-matches
    Purity of oracle classification:  0.929
    Entropy of oracle classification: 0.371
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
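
At 100% accuracy the oracle simply returns the true labels; an oracle of lower accuracy would flip some of the queried labels. A hypothetical sketch of such a noisy oracle (the original program's oracle implementation is not shown in this output):

```python
import random

def noisy_oracle(true_labels, accuracy, rng=random.Random(42)):
    """Simulate a human oracle of limited accuracy: a fixed number of the
    queried labels, chosen at random, is flipped (hypothetical sketch)."""
    n_wrong = int(round((1.0 - accuracy) * len(true_labels)))
    labels = list(true_labels)
    for i in rng.sample(range(len(labels)), n_wrong):
        labels[i] = not labels[i]
    return labels

# At 100% accuracy every label is returned unchanged
labels = noisy_oracle([True, False, True], 1.0)
```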

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)448_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 448), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)448_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 738
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 738 weight vectors
  Containing 217 true matches and 521 true non-matches
    (29.40% true matches)
  Identified 703 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   688  (97.87%)
          2 :    12  (1.71%)
          3 :     2  (0.28%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 703 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 520

Removed 1 non-pure weight vector

Final number of weight vectors to use: 737
  Number of unique weight vectors: 703

Time to load and analyse the weight vector file: 0.01 sec
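
The pureness analysis above groups identical weight vectors and computes, for each unique vector, the fraction of its occurrences that are true matches; vectors with a pureness strictly between 0 and 1 are the non-pure ones flagged for removal. A minimal sketch of the pureness computation (names are illustrative, not from the original program):

```python
from collections import defaultdict

def pureness_per_unique_vector(weight_vectors, labels):
    """Pureness of each unique weight vector: the fraction of its
    occurrences that are true matches (1.0 or 0.0 means pure)."""
    counts = defaultdict(lambda: [0, 0])   # vector -> [num_matches, total]
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)                   # lists are unhashable, tuples work
        counts[key][0] += int(is_match)
        counts[key][1] += 1
    return {k: m / t for k, (m, t) in counts.items()}

p = pureness_per_unique_vector(
    [[0.5, 0.2], [0.5, 0.2], [0.9, 0.9]],
    [True, False, True])
```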

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (703, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 703 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 703 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 619 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 129 matches and 490 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (129, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (490, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 129 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 129 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.420, 1.000, 1.000, 1.000, 1.000, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 48 matches and 3 non-matches
    Purity of oracle classification:  0.941
    Entropy of oracle classification: 0.323
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(10)122_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 122), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)122_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 891
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 891 weight vectors
  Containing 199 true matches and 692 true non-matches
    (22.33% true matches)
  Identified 836 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   800  (95.69%)
          2 :    33  (3.95%)
          3 :     2  (0.24%)
         19 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 836 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 164
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 671

Removed 1 non-pure weight vector

Final number of weight vectors to use: 890
  Number of unique weight vectors: 836

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (836, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 836 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 836 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 750 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 184 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (184, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 566 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.565, 0.667, 0.600, 0.412, 0.381] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 0 matches and 73 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)800_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 800), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)800_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 268
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 268 weight vectors
  Containing 152 true matches and 116 true non-matches
    (56.72% true matches)
  Identified 253 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   246  (97.23%)
          2 :     4  (1.58%)
          3 :     2  (0.79%)
          8 :     1  (0.40%)

Identified 1 non-pure unique weight vector (from 253 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 137
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 115

Removed 8 non-pure weight vectors

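Non-pure unique weight vectors (identical vectors that carry both match and non-match labels) are removed before clustering. A sketch of that filter under an assumed data shape, a list of `(vector, is_match)` pairs, with a hypothetical function name:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """Drop every weight vector whose identical copies carry mixed labels.

    weight_vectors: list of (vector_tuple, is_match) pairs (assumed shape).
    """
    counts = defaultdict(lambda: [0, 0])  # vector -> [matches, non-matches]
    for vec, is_match in weight_vectors:
        counts[vec][0 if is_match else 1] += 1
    # A unique vector is pure if all of its occurrences share one label.
    return [(vec, is_match) for vec, is_match in weight_vectors
            if counts[vec][0] == 0 or counts[vec][1] == 0]
```

In the run above one unique vector occurs 8 times with pureness 0.875 (7 matches, 1 non-match), so all 8 of its copies are removed.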
Final number of weight vectors to use: 260
  Number of unique weight vectors: 252

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (252, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 252 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 70

Perform initial selection using "far" method

Farthest first selection of 70 weight vectors from 252 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

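The "far" initial selection is a farthest-first traversal: starting from a seed vector, it repeatedly picks the remaining vector whose distance to its nearest already-selected vector is largest, yielding a sample spread across the cluster. A sketch of the idea (the seed choice and tie-breaking here are assumptions; the program's exact rules are not shown in this output):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal (Euclidean distance).

    Sketch only: seeds with the first vector; the actual program's seed
    and tie-breaking rules are assumptions.
    """
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest selected neighbour.
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```

On one-dimensional points `[(0,0), (1,0), (10,0), (5,0)]` with k=3, this picks `(0,0)`, then `(10,0)` (distance 10), then `(5,0)` (distance 5 to either end).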
Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 35 matches and 35 non-matches
    Purity of oracle classification:  0.500
    Entropy of oracle classification: 1.000
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  35
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 182 weight vectors
  Based on 35 matches and 35 non-matches
  Classified 107 matches and 75 non-matches

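After the oracle labels a sample, the remaining vectors in the cluster are split into two child clusters by a classifier trained on that sample (an SVM, per the log). As a dependency-free illustration, the sketch below substitutes a nearest-centroid rule for the SVM; it shows the split mechanics only, not the program's actual classifier:

```python
def split_cluster(train_matches, train_non_matches, unlabelled):
    """Split unlabelled weight vectors into predicted-match and
    predicted-non-match clusters via a nearest-centroid rule
    (a stand-in for the SVM used by the program).
    """
    def centroid(vecs):
        return [sum(v[i] for v in vecs) / len(vecs)
                for i in range(len(vecs[0]))]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_match = centroid(train_matches)
    c_non_match = centroid(train_non_matches)
    matches, non_matches = [], []
    for v in unlabelled:
        if sq_dist(v, c_match) <= sq_dist(v, c_non_match):
            matches.append(v)
        else:
            non_matches.append(v)
    return matches, non_matches
```

Both resulting clusters go back onto the queue, as the Loop 2 output below shows (queue length 2, sizes 107 and 75).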
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 70
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (107, 0.5, 1.0, 0.5)
    (75, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 35 / 35

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 75 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 75 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.619, 1.000, 0.103, 0.163, 0.129, 0.146, 0.213] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 4 matches and 38 non-matches
    Purity of oracle classification:  0.905
    Entropy of oracle classification: 0.454
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

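The oracle simulates manual classification at a configurable accuracy (the oracle_acc parameter): each queried label is reported correctly with that probability and flipped otherwise. With 100% accuracy, as in these runs, no labels are flipped. A sketch of the simulation idea (function name and sampling scheme assumed):

```python
import random

def noisy_oracle(true_labels, accuracy, seed=0):
    """Return simulated oracle answers for a list of boolean match labels.

    Each label is reported correctly with probability `accuracy` and
    flipped otherwise (sketch; the program's exact sampling scheme
    is an assumption).
    """
    rng = random.Random(seed)
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]
```

At accuracy 1.0 every answer matches the true label, which is why the runs above report zero false matches and zero false non-matches.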
103.0
Analyzing file: diverg(10)338_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 338), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)338_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 851
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 851 weight vectors
  Containing 154 true matches and 697 true non-matches
    (18.10% true matches)
  Identified 815 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   787  (96.56%)
          2 :    25  (3.07%)
          3 :     2  (0.25%)
          8 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 815 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 138
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 676

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 843
  Number of unique weight vectors: 814

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (814, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 814 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 814 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 32 matches and 54 non-matches
    Purity of oracle classification:  0.628
    Entropy of oracle classification: 0.952
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 728 weight vectors
  Based on 32 matches and 54 non-matches
  Classified 156 matches and 572 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.627906976744186, 0.9522656254366642, 0.37209302325581395)
    (572, 0.627906976744186, 0.9522656254366642, 0.37209302325581395)

Current size of match and non-match training data sets: 32 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 572 weight vectors
- Estimated match proportion 0.372

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 572 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analyzing file: diverg(10)135_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 135), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)135_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 547
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 547 weight vectors
  Containing 166 true matches and 381 true non-matches
    (30.35% true matches)
  Identified 529 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   520  (98.30%)
          2 :     6  (1.13%)
          3 :     2  (0.38%)
          9 :     1  (0.19%)

Identified 1 non-pure unique weight vector (from 529 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 148
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 380

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 538
  Number of unique weight vectors: 528

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (528, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 528 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 528 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 28 matches and 53 non-matches
    Purity of oracle classification:  0.654
    Entropy of oracle classification: 0.930
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0
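
The purity, entropy and estimated match proportion reported after each oracle round can be reproduced directly from the match / non-match counts. A minimal sketch (the function name is illustrative, not taken from the original script):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity, Shannon entropy and estimated match proportion of a
    labelled sample, as reported after each oracle round."""
    total = num_matches + num_non_matches
    p = num_matches / total              # estimated match proportion
    purity = max(p, 1.0 - p)             # fraction of the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)  # Shannon entropy in bits
    return purity, entropy, p

# The sample above: 28 matches and 53 non-matches
purity, entropy, p = cluster_stats(28, 53)
# purity ~ 0.654, entropy ~ 0.930, p ~ 0.346
```

These sample statistics are then attached to both sub-clusters produced by the subsequent split, which is why the two queue entries in the next loop share the same purity and entropy values.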

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 447 weight vectors
  Based on 28 matches and 53 non-matches
  Classified 124 matches and 323 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (124, 0.654320987654321, 0.9301497323974337, 0.345679012345679)
    (323, 0.654320987654321, 0.9301497323974337, 0.345679012345679)

Current size of match and non-match training data sets: 28 / 53

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 323 weight vectors
- Estimated match proportion 0.346

Sample size for this cluster: 68
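
The per-cluster sample sizes in this log (87 of 1002, 82 of 580, 77 of 738, 70 of 379, 68 of 323) are consistent with Cochran's sample-size formula with finite-population correction, applied to the cluster's estimated match proportion with a 95% z-score and a 0.1 margin of error. A sketch under those assumed parameters:

```python
def sample_size(cluster_size, est_match_prop, z=1.96, e=0.1):
    """Cochran sample size with finite-population correction, truncated
    to an integer; z and e are assumed values that reproduce this log."""
    n0 = z * z * est_match_prop * (1.0 - est_match_prop) / (e * e)
    return int(n0 / (1.0 + (n0 - 1.0) / cluster_size))

# sample_size(323, 28 / 81) -> 68, matching the line above
```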

Farthest first selection of 68 weight vectors from 323 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.400, 0.733, 0.667, 0.647, 0.737] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.800, 0.684, 0.667, 0.529, 0.609] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
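
Farthest-first selection grows the sample greedily: starting from one vector, it repeatedly adds the vector whose minimum Euclidean distance to the already selected vectors is largest, so the sample spreads across the whole cluster. A minimal sketch (the seeding choice is an assumption):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: pick k maximally spread-out
    vectors under Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # assumed seed: the first vector
    # minimum distance from each candidate to the selected set so far
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected
```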

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 1 match and 67 non-matches
    Purity of oracle classification:  0.985
    Entropy of oracle classification: 0.111
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing file: diverg(20)856_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 856), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)856_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1059
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1059 weight vectors
  Containing 227 true matches and 832 true non-matches
    (21.44% true matches)
  Identified 1002 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   965  (96.31%)
          2 :    34  (3.39%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1002 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 811
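
The pureness of a unique weight vector is the fraction of its identical copies that were generated by true matching pairs; a vector with pureness strictly between 0 and 1 is non-pure and has (at least) its minority-class copies removed. A sketch of the grouping step (the dictionary layout is an assumption):

```python
from collections import defaultdict

def pureness_distribution(weight_vectors, labels):
    """Group identical weight vectors and return, per unique vector,
    the fraction of its copies generated by true matches."""
    counts = defaultdict(lambda: [0, 0])   # vector -> [matches, non-matches]
    for vec, is_match in zip(weight_vectors, labels):
        counts[tuple(vec)][0 if is_match else 1] += 1
    return {vec: m / (m + n) for vec, (m, n) in counts.items()}
```

For example, a vector occurring 20 times with 19 true-match copies has pureness 0.95, matching the entry above.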

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1058
  Number of unique weight vectors: 1002

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1002, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1002 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1002 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 915 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 177 matches and 738 non-matches
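
The SVM step trains on the vectors just labelled by the oracle and splits the cluster's remaining vectors into a predicted-match and a predicted-non-match sub-cluster. A sketch using scikit-learn (the kernel and parameters are assumptions; the original script's classifier settings are not shown in this log):

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-labelled vectors, then split the rest of
    the cluster by predicted match status."""
    clf = svm.SVC(kernel="linear")  # assumed kernel
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    matches = [v for v, m in zip(cluster_vecs, pred) if m]
    non_matches = [v for v, m in zip(cluster_vecs, pred) if not m]
    return matches, non_matches
```

The two resulting sub-clusters are then appended to the queue, as the next loop's two entries show.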

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (177, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (738, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 738 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 738 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)175_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 175), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)175_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 617
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 617 weight vectors
  Containing 156 true matches and 461 true non-matches
    (25.28% true matches)
  Identified 581 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   553  (95.18%)
          2 :    25  (4.30%)
          3 :     2  (0.34%)
          8 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 581 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 140
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 440

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 609
  Number of unique weight vectors: 580

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (580, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 580 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 580 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 28 matches and 54 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 498 weight vectors
  Based on 28 matches and 54 non-matches
  Classified 119 matches and 379 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (119, 0.6585365853658537, 0.9262122127346665, 0.34146341463414637)
    (379, 0.6585365853658537, 0.9262122127346665, 0.34146341463414637)

Current size of match and non-match training data sets: 28 / 54

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 379 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 379 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.462, 0.609, 0.684, 0.308, 0.545] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
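The farthest-first traversal above greedily picks, at each step, the weight vector whose minimum distance to the already-selected vectors is largest, which spreads the sample across the whole cluster. A minimal sketch of the idea (pure Python with Euclidean distance; the actual distance metric and seeding strategy used by the program are assumptions):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal (greedy max-min).
    Assumes at least k distinct vectors; seeding with the first vector
    is an arbitrary choice."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    while len(selected) < k:
        # pick the vector whose nearest selected neighbour is farthest away
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

Farthest-first traversal is the classic greedy 2-approximation for the k-center objective, which is why it works well as a coverage-oriented sampler here.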

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 2 matches and 68 non-matches
    Purity of oracle classification:  0.971
    Entropy of oracle classification: 0.187
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0
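The purity and entropy figures above follow directly from the 2 / 68 oracle split: purity is the majority-class fraction of the sample, and entropy is the binary Shannon entropy of the match proportion (formulas inferred from the reported numbers):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = majority-class fraction; entropy = binary Shannon entropy
    of the match proportion."""
    total = num_match + num_non_match
    p = num_match / total  # match proportion
    purity = max(num_match, num_non_match) / total
    entropy = 0.0 if p in (0.0, 1.0) else \
        -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return purity, entropy

purity, entropy = purity_entropy(2, 68)  # the split reported by the oracle above
```

For 2 matches and 68 non-matches this yields purity 0.971 and entropy 0.187, matching the log.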

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
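The run above suggests the overall control flow: pop a cluster from the queue, sample it, label the sample via the oracle, and push the split children back until the manual classification budget is exhausted. A highly simplified sketch of that loop (all names, thresholds, and helper signatures here are hypothetical, not the program's actual API):

```python
def recursive_selection(cluster, oracle, split_fn, sample_fn, budget,
                        min_purity=0.95, max_cluster_size=50):
    """Simplified control flow suggested by the log: repeatedly sample a
    cluster, label the sample manually (oracle), and split impure or
    oversized clusters with a classifier trained on the labels so far."""
    queue = [cluster]
    train_m, train_n = [], []
    used = 0
    while queue and used < budget:
        current = queue.pop(0)  # queue ordering is random in the log
        sample = sample_fn(current, budget - used)
        if not sample:
            continue
        used += len(sample)
        labels = [oracle(v) for v in sample]  # manual classification
        train_m += [v for v, l in zip(sample, labels) if l]
        train_n += [v for v, l in zip(sample, labels) if not l]
        rest = [v for v in current if v not in sample]
        purity = max(labels.count(True), labels.count(False)) / len(labels)
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            queue.extend(split_fn(train_m, train_n, rest))  # e.g. the SVM split
    return train_m, train_n

# toy demo: vectors are scalars; the oracle and split_fn below are stand-ins
m, n = recursive_selection(
    [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6],
    oracle=lambda v: v >= 0.5,
    split_fn=lambda tm, tn, rest: [[v for v in rest if v >= 0.5],
                                   [v for v in rest if v < 0.5]],
    sample_fn=lambda c, remaining: c[:min(4, remaining, len(c))],
    budget=8)
```

The loop stops exactly when the budget is spent, which is the "Reached end of manual classification budget" condition above.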

103.0
Analysing the file: diverg(10)724_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 724), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)724_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 746
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 746 weight vectors
  Containing 220 true matches and 526 true non-matches
    (29.49% true matches)
  Identified 692 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   655  (94.65%)
          2 :    34  (4.91%)
          3 :     2  (0.29%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 692 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 505

Removed 1 non-pure weight vector

Final number of weight vectors to use: 745
  Number of unique weight vectors: 692
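The load step above groups identical weight vectors, counts their occurrences, and drops the minority-class copies of any vector that occurs with both labels: here the 0.941-pure vector occurs 17 times (16 matches, 1 non-match), so its single non-match copy is removed, leaving 745 of 746. A sketch of that cleanup (data and helper name are hypothetical):

```python
from collections import Counter, defaultdict

def clean_weight_vectors(pairs):
    """pairs: list of (weight_vector_tuple, is_match). Drops minority-class
    copies of any vector that occurs with both labels (non-pure).
    Ties keep both classes, a simplification of the real program."""
    labels = defaultdict(Counter)
    for vec, is_match in pairs:
        labels[vec][is_match] += 1
    kept = []
    for vec, is_match in pairs:
        counts = labels[vec]
        # pure vector, or this copy belongs to the majority class: keep it
        if len(counts) == 1 or counts[is_match] == max(counts.values()):
            kept.append((vec, is_match))
    return kept
```

Applied to a vector seen 16 times as a match and once as a non-match, only the lone non-match copy is discarded.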

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (692, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 692 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 692 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 608 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 300 matches and 308 non-matches
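After the oracle labels a sample, the remaining 608 vectors are split by an SVM trained on those 31 + 53 labelled examples, and the two predicted sides become the child clusters pushed onto the queue. A minimal sketch of that split using scikit-learn (the kernel and parameters are assumptions; the program's actual SVM settings are not shown in this log):

```python
from sklearn.svm import SVC

def svm_split(labelled_vecs, labelled_classes, remaining_vecs):
    """Train an SVM on oracle-labelled vectors, then split the rest of the
    cluster into predicted-match and predicted-non-match child clusters."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(labelled_vecs, labelled_classes)
    pred = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, pred) if p == 0]
    return matches, non_matches
```

The log's 300 / 308 split of the 608 remaining vectors would come from exactly this predict step.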

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (300, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (308, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
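Each queue entry above is a tuple (size, purity, entropy, estimated match proportion), and both children of the split inherit the statistics of the parent's oracle sample (31 matches / 53 non-matches), which is why the two tuples differ only in size. A sketch reproducing the reported values (function name is hypothetical):

```python
import math

def cluster_stats(size, sample_matches, sample_non_matches):
    """Queue-entry tuple for a child cluster, derived from the parent's
    oracle-labelled sample."""
    total = sample_matches + sample_non_matches
    p = sample_matches / total  # estimated match proportion
    purity = max(sample_matches, sample_non_matches) / total
    entropy = 0.0 if p in (0.0, 1.0) else \
        -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return (size, purity, entropy, p)
```

With the 31 / 53 sample this gives (300, 0.631, 0.950, 0.369) for the first child, matching the queue entry above.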

Current size of match and non-match training data sets: 31 / 53

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 300 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 300 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.261, 0.174, 0.148, 0.186, 0.148] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.600, 1.000, 0.217, 0.132, 0.167, 0.125, 0.188] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 42 matches and 27 non-matches
    Purity of oracle classification:  0.609
    Entropy of oracle classification: 0.966
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  27
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(20)935_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 935), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)935_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1069
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1069 weight vectors
  Containing 221 true matches and 848 true non-matches
    (20.67% true matches)
  Identified 1013 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   977  (96.45%)
          2 :    33  (3.26%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1013 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1068
  Number of unique weight vectors: 1013

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1013, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1013 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1013 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 926 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 106 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (106, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 106 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 106 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 44 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(15)424_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 424), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)424_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 766
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 766 weight vectors
  Containing 187 true matches and 579 true non-matches
    (24.41% true matches)
  Identified 742 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   729  (98.25%)
          2 :    10  (1.35%)
          3 :     2  (0.27%)
         11 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 742 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 165
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 576

Removed 1 non-pure weight vector

Final number of weight vectors to use: 765
  Number of unique weight vectors: 742

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (742, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 742 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 742 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

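The selection above is a greedy farthest-first traversal: starting from a seed vector, repeatedly add the remaining vector whose minimum distance to the already-selected set is largest. A sketch assuming Euclidean distance and the first vector as the seed (neither choice is confirmed by this log):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly add the vector maximising the minimum Euclidean distance
    to the current selection, until k vectors are chosen."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because each new pick maximises its distance to everything chosen so far, the sample spreads across the weight-vector space rather than clustering near one region, which is why the list above mixes clear matches and clear non-matches.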
Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

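The reported purity and entropy follow the standard two-class definitions: purity is the majority-class fraction and entropy is the binary Shannon entropy (in bits) of the match proportion. The sketch below reproduces the logged values for this round (30 matches, 55 non-matches); the script's own implementation is not shown in this log:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy (in bits) of the match proportion."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1 - p)
    entropy = 0.0
    for q in (p, 1 - p):
        if q > 0:  # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(30, 55)
# purity ~ 0.647 and entropy ~ 0.937, matching the logged figures
```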
Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 657 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 127 matches and 530 non-matches

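After each oracle round, the remaining unlabelled weight vectors in the cluster are split by a classifier trained on the oracle-labelled examples. A minimal sketch using scikit-learn's `SVC` (the kernel and parameters used by the original script are not shown in this log, so defaults are assumed):

```python
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on the oracle-labelled weight vectors and use it to
    split the remaining vectors into predicted matches / non-matches."""
    clf = SVC()  # default RBF kernel: an assumption, not from the log
    clf.fit(labelled_vecs, labels)
    preds = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, preds) if p == 0]
    return matches, non_matches
```

The two predicted subsets then re-enter the queue as separate clusters, as the "Loop 2: Queue length: 2" line below shows.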
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (127, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (530, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 127 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 127 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 48 matches and 4 non-matches
    Purity of oracle classification:  0.923
    Entropy of oracle classification: 0.391
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(10)599_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 599), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)599_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1011
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1011 weight vectors
  Containing 196 true matches and 815 true non-matches
    (19.39% true matches)
  Identified 969 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   934  (96.39%)
          2 :    32  (3.30%)
          3 :     2  (0.21%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 969 unique weight vectors)
Pureness (as a proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 795

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1011
  Number of unique weight vectors: 969

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (969, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 969 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 969 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 25 matches and 62 non-matches
    Purity of oracle classification:  0.713
    Entropy of oracle classification: 0.865
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 882 weight vectors
  Based on 25 matches and 62 non-matches
  Classified 98 matches and 784 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (98, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)
    (784, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 25 / 62

Selected cluster (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 784 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 784 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 16 matches and 55 non-matches
    Purity of oracle classification:  0.775
    Entropy of oracle classification: 0.770
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(10)111_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 111), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)111_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 605
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 605 weight vectors
  Containing 154 true matches and 451 true non-matches
    (25.45% true matches)
  Identified 569 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   541  (95.08%)
          2 :    25  (4.39%)
          3 :     2  (0.35%)
          8 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 569 unique weight vectors)
Pureness (as a proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 138
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 430

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 597
  Number of unique weight vectors: 568

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (568, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 568 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 568 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 25 matches and 57 non-matches
    Purity of oracle classification:  0.695
    Entropy of oracle classification: 0.887
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
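The purity and entropy figures reported after each oracle call follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the split. A minimal sketch reproducing the numbers above (the function name is mine, not from the script):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    # Purity is the fraction of the majority class in the sample;
    # entropy is the binary Shannon entropy of the match/non-match split.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# Counts from the oracle call above: 25 matches, 57 non-matches
purity, entropy = purity_and_entropy(25, 57)
print(round(purity, 3), round(entropy, 3))  # 0.695 0.887
```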

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 486 weight vectors
  Based on 25 matches and 57 non-matches
  Classified 155 matches and 331 non-matches
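The SVM step trains on the weight vectors the oracle just labelled and splits the remaining vectors of the cluster into a predicted-match and a predicted-non-match sub-cluster (155 and 331 above), which then join the queue. A sketch using scikit-learn; the kernel and parameters are assumptions, since the log does not show them:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    # Fit an SVM on the oracle-classified training vectors (1 = match,
    # 0 = non-match) and split the unlabelled cluster by its predictions.
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches
```

As the queue listing of the next loop shows, both sub-clusters initially inherit the purity, entropy, and match proportion estimated from the oracle sample.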

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6951219512195121, 0.8871723027673717, 0.3048780487804878)
    (331, 0.6951219512195121, 0.8871723027673717, 0.3048780487804878)

Current size of match and non-match training data sets: 25 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.89
- Size 331 weight vectors
- Estimated match proportion 0.305

Sample size for this cluster: 65

Farthest first selection of 65 weight vectors from 331 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.269, 0.478, 0.750, 0.385, 0.455] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.538, 0.600, 0.471, 0.632, 0.688] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.500, 0.571, 0.467, 0.467, 0.389] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.179, 0.500, 0.412, 0.357] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.800, 0.667, 0.381, 0.550, 0.429] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.571, 0.286, 0.333, 0.571, 0.600] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
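Each "Farthest first selection" above is a greedy farthest-first traversal: starting from an initial vector, repeatedly pick the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster. A minimal sketch; the starting point and the Euclidean metric are assumptions:

```python
import math

def farthest_first(vectors, k, start=0):
    # Greedy farthest-first traversal over a list of weight vectors.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest selected vector.
        far = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(far)
        remaining.remove(far)
    return selected
```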

Perform oracle with 100.00% accuracy on 65 weight vectors
  The oracle will correctly classify 65 weight vectors and wrongly classify 0
  Classified 0 matches and 65 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 65 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing the file: diverg(20)822_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 822), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)822_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 969
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 969 weight vectors
  Containing 219 true matches and 750 true non-matches
    (22.60% true matches)
  Identified 914 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   878  (96.06%)
          2 :    33  (3.61%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)
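The occurrence distribution above counts how many weight vectors appear once, twice, and so on; it can be reproduced with two passes of `collections.Counter` (a sketch, not the script's actual code):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First count how often each weight vector occurs, then count
    # how many unique vectors share each occurrence frequency.
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())
```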

Identified 1 non-pure unique weight vector (from 914 unique weight vectors)
Pureness (as a percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 729

Removed 1 non-pure weight vector
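A unique weight vector is non-pure when identical vectors were produced by both matching and non-matching record pairs (the pureness-0.947 entry above); its minority-class copies are removed so that every remaining unique vector carries a single true label. A sketch of this filtering; the function name and the 1/0 label encoding are assumptions:

```python
from collections import defaultdict

def remove_minority_copies(vectors, labels):
    # Group identical weight vectors, compute each group's pureness
    # (fraction of true matches), and drop copies whose label disagrees
    # with the group's majority class.
    groups = defaultdict(list)
    for v, lab in zip(vectors, labels):
        groups[tuple(v)].append(lab)
    kept_vecs, kept_labels = [], []
    for v, lab in zip(vectors, labels):
        labs = groups[tuple(v)]
        pureness = sum(labs) / len(labs)
        majority = 1 if pureness >= 0.5 else 0
        if pureness in (0.0, 1.0) or lab == majority:
            kept_vecs.append(v)
            kept_labels.append(lab)
    return kept_vecs, kept_labels
```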

Final number of weight vectors to use: 968
  Number of unique weight vectors: 914

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (914, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 914 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 914 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 827 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 704 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (704, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 123 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(20)846_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 846), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)846_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1058
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1058 weight vectors
  Containing 209 true matches and 849 true non-matches
    (19.75% true matches)
  Identified 1011 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   976  (96.54%)
          2 :    32  (3.17%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1011 unique weight vectors)
Pureness (as a percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1057
  Number of unique weight vectors: 1011

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1011, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1011 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1011 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 924 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 104 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (104, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
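
The purity and entropy figures reported for each oracle-classified sample follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the class distribution. A minimal sketch (an illustrative re-implementation, not the program's own code):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity: fraction of the sample in the majority class.
    # Entropy: binary Shannon entropy of the match proportion.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 14 matches and 54 non-matches above, this reproduces the reported purity of 0.794 and entropy of 0.734.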

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(10)729_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 729), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)729_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 707
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 707 weight vectors
  Containing 208 true matches and 499 true non-matches
    (29.42% true matches)
  Identified 673 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   656  (97.47%)
          2 :    14  (2.08%)
          3 :     2  (0.30%)
         17 :     1  (0.15%)
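
The frequency distribution above is a histogram of occurrence counts over the unique weight vectors. A sketch of how such a distribution can be computed (a hypothetical helper, not the program's code; vectors must be hashable, hence the tuple conversion):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First count how often each unique weight vector occurs,
    # then histogram those occurrence counts.
    per_vector = Counter(map(tuple, weight_vectors))
    return sorted(Counter(per_vector.values()).items())
```

For example, a data set with one vector occurring once, one twice, and one three times yields [(1, 1), (2, 1), (3, 1)].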

Identified 1 non-pure unique weight vector (from 673 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 496

Removed 1 non-pure weight vector
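
The pureness filter drops the minority-class copies of any unique weight vector that carries both match and non-match labels (here, the vector with pureness 0.941 loses its single minority copy). A hedged sketch of that filter, with illustrative names:

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    # labelled_vectors: list of (weight_vector_tuple, is_match) pairs.
    # For each unique vector, keep only the majority-class copies so
    # every remaining unique vector is pure.
    by_vector = defaultdict(list)
    for vec, is_match in labelled_vectors:
        by_vector[vec].append(is_match)
    kept = []
    for vec, labels in by_vector.items():
        majority = sum(labels) * 2 >= len(labels)  # ties resolved as match
        kept.extend((vec, lbl) for lbl in labels if lbl == majority)
    return kept
```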

Final number of weight vectors to use: 706
  Number of unique weight vectors: 673

Time to load and analyse the weight vector file: 0.04 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (673, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 673 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 673 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
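
Farthest-first selection greedily picks vectors that are maximally spread out in the weight space: after a starting vector, each step adds the vector whose distance to its nearest already-selected vector is largest. A minimal re-implementation sketch (assuming Euclidean distance and a deterministic start; the program's actual starting point and tie-breaking may differ):

```python
def farthest_first(vectors, k):
    # Greedy farthest-first traversal over a list of numeric tuples.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]
    candidates = list(vectors[1:])
    while len(selected) < k and candidates:
        # Pick the candidate farthest from its nearest selected vector.
        best = max(candidates, key=lambda v: min(dist(v, s) for s in selected))
        candidates.remove(best)
        selected.append(best)
    return selected
```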

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 35 matches and 49 non-matches
    Purity of oracle classification:  0.583
    Entropy of oracle classification: 0.980
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0
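
An oracle with configurable accuracy, as used above, can be simulated by flipping each true match status with probability 1 - accuracy; at 100% accuracy nothing is flipped, so false matches and false non-matches stay at zero. An illustrative sketch (not the program's own oracle code):

```python
import random

def noisy_oracle(true_labels, accuracy, seed=42):
    # Return each true match status unchanged with probability
    # `accuracy`, flipped otherwise.
    rng = random.Random(seed)
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]
```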

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 589 weight vectors
  Based on 35 matches and 49 non-matches
  Classified 279 matches and 310 non-matches
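
After the oracle labels the sample, the remaining unlabelled vectors are split into two child clusters by a classifier trained on those labels; the log uses an SVM. The sketch below substitutes a nearest-centroid rule as a simple stand-in for the SVM, just to show the shape of the split step (names are illustrative):

```python
def split_cluster(match_train, non_match_train, remaining):
    # Assign each remaining vector to the side whose training
    # centroid is closer, producing two child clusters.
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cm = centroid(match_train)
    cn = centroid(non_match_train)
    match_side = [v for v in remaining if dist2(v, cm) <= dist2(v, cn)]
    non_match_side = [v for v in remaining if dist2(v, cm) > dist2(v, cn)]
    return match_side, non_match_side
```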

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (279, 0.5833333333333334, 0.9798687566511527, 0.4166666666666667)
    (310, 0.5833333333333334, 0.9798687566511527, 0.4166666666666667)

Current size of match and non-match training data sets: 35 / 49

Selected cluster (queue ordering: random) with:
- Purity 0.58 and entropy 0.98
- Size 279 weight vectors
- Estimated match proportion 0.417

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 279 vectors
  The selected farthest weight vectors are:
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 46 matches and 24 non-matches
    Purity of oracle classification:  0.657
    Entropy of oracle classification: 0.928
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  24
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analyzing file: diverg(20)352_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 352), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)352_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)561_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 561), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)561_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 377
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 377 weight vectors
  Containing 195 true matches and 182 true non-matches
    (51.72% true matches)
  Identified 350 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   334  (95.43%)
          2 :    13  (3.71%)
          3 :     2  (0.57%)
         11 :     1  (0.29%)

Identified 1 non-pure unique weight vector (from 350 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 170
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 179

Removed 1 non-pure weight vector

Final number of weight vectors to use: 376
  Number of unique weight vectors: 350

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (350, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 350 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 350 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 31 matches and 44 non-matches
    Purity of oracle classification:  0.587
    Entropy of oracle classification: 0.978
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0
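The purity and entropy figures printed in these oracle summaries follow the standard two-class definitions: purity is the fraction of the majority class, and entropy is the binary entropy of the match proportion. A minimal sketch (the function name `purity_entropy` is mine, not from the script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity and binary entropy of a two-class cluster, matching the
    'Purity of oracle classification' / 'Entropy of oracle classification'
    lines in the log."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)         # fraction of the majority class
    if p in (0.0, 1.0):
        entropy = 0.0                # a pure cluster has zero entropy
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy

# The summary above: 31 matches and 44 non-matches
purity, entropy = purity_entropy(31, 44)
print(round(purity, 3), round(entropy, 3))  # 0.587 0.978
```

This reproduces the 0.587 / 0.978 values reported for the 31-match, 44-non-match classification above.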

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 275 weight vectors
  Based on 31 matches and 44 non-matches
  Classified 143 matches and 132 non-matches
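The split step above trains an SVM on the oracle-labelled vectors and uses it to partition the remaining cluster into a predicted-match and a predicted-non-match sub-cluster. As a dependency-free stand-in for the SVM (the actual script uses an SVM classifier), the same partitioning idea can be sketched with a nearest-centroid rule; `split_cluster` and `centroid` are illustrative names of mine:

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length weight vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(cluster, match_train, non_match_train):
    """Partition 'cluster' into (predicted matches, predicted non-matches)
    by nearest training centroid -- a stand-in for the SVM split above."""
    cm = centroid(match_train)
    cn = centroid(non_match_train)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches, non_matches = [], []
    for v in cluster:
        (matches if sqdist(v, cm) <= sqdist(v, cn) else non_matches).append(v)
    return matches, non_matches

m, n = split_cluster([[0.9, 0.9], [0.1, 0.0]],   # unlabelled cluster
                     [[1.0, 1.0]],               # oracle-labelled matches
                     [[0.0, 0.0]])               # oracle-labelled non-matches
print(len(m), len(n))  # 1 1
```

The two resulting sub-clusters are what re-enter the queue (hence "Queue length: 2" in the next loop).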

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.5866666666666667, 0.9782176659354248, 0.41333333333333333)
    (132, 0.5866666666666667, 0.9782176659354248, 0.41333333333333333)

Current size of match and non-match training data sets: 31 / 44

Selected cluster with (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 143 weight vectors
- Estimated match proportion 0.413

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 143 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
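The "farthest first" selections listed in this log greedily pick weight vectors that are maximally spread out. A minimal sketch of the underlying greedy traversal (repeatedly select the vector farthest from everything already selected), under the assumption of Euclidean distance; this is an illustration of the general technique, not the script's exact implementation:

```python
import math

def farthest_first(vectors, k, start_index=0):
    """Greedy farthest-first traversal: starting from one vector, repeatedly
    add the vector whose distance to its nearest already-selected vector
    is largest, until k vectors are selected."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start_index]]
    # Distance from each candidate to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            d = dist(v, vectors[i])
            if d < min_dist[j]:
                min_dist[j] = d
    return selected

sample = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [0.9, 1.0]], 2)
print(sample)  # [[0.0, 0.0], [1.0, 1.0]]
```

Sampling the extremes this way is why the selected vectors above mix very high and very low similarity values: the goal is diverse training examples for the oracle, not representative ones.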

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 51 matches and 6 non-matches
    Purity of oracle classification:  0.895
    Entropy of oracle classification: 0.485
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)896_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 896), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)896_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1041
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1041 weight vectors
  Containing 213 true matches and 828 true non-matches
    (20.46% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   954  (96.46%)
          2 :    32  (3.24%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)
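The occurrence table above counts how many distinct weight vectors appear once, twice, and so on. With vectors as tuples, this is a two-level count over the data; a sketch (the function name `occurrence_distribution` is mine):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight vectors
    occurring that often (the 'Occurrence : Number of weight vectors'
    table in the log)."""
    # First level: how often each distinct vector occurs
    per_vector = Counter(tuple(v) for v in weight_vectors)
    # Second level: how many distinct vectors share each occurrence count
    return Counter(per_vector.values())

vectors = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3], [0.9, 0.9]]
print(sorted(occurrence_distribution(vectors).items()))  # [(1, 2), (2, 1)]
```

For the run above this yields 954 vectors occurring once, 32 twice, 2 three times, and 1 vector occurring 17 times.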

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 807

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1040
  Number of unique weight vectors: 989

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 44 matches and 858 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (44, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (858, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 858 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 858 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.000, 0.556, 0.182, 0.500, 0.071, 0.400] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.233, 0.545, 0.714, 0.455, 0.238] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 17 matches and 56 non-matches
    Purity of oracle classification:  0.767
    Entropy of oracle classification: 0.783
    Number of true matches:      17
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)304_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 304), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)304_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 755
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 755 weight vectors
  Containing 203 true matches and 552 true non-matches
    (26.89% true matches)
  Identified 726 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   709  (97.66%)
          2 :    14  (1.93%)
          3 :     2  (0.28%)
         12 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 726 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 549

Removed 1 non-pure weight vector

Final number of weight vectors to use: 754
  Number of unique weight vectors: 726

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (726, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 726 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 726 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 36 matches and 49 non-matches
    Purity of oracle classification:  0.576
    Entropy of oracle classification: 0.983
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 641 weight vectors
  Based on 36 matches and 49 non-matches
  Classified 307 matches and 334 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (307, 0.5764705882352941, 0.9830605548016025, 0.4235294117647059)
    (334, 0.5764705882352941, 0.9830605548016025, 0.4235294117647059)

Current size of match and non-match training data sets: 36 / 49

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 334 weight vectors
- Estimated match proportion 0.424

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 334 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.750, 0.905, 0.667, 0.500, 0.571] (False)
    [1.000, 0.000, 0.579, 0.583, 0.522, 0.417, 0.563] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.429, 0.786, 0.750, 0.389, 0.857] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.769, 0.679, 0.412, 0.591, 0.500] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.556, 0.429, 0.500, 0.700, 0.643] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.500, 0.600, 0.353, 0.611, 0.526] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
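
The farthest-first selection used throughout this run can be sketched in a few lines of pure Python — a minimal version assuming Euclidean distance between weight vectors and a random starting vector (the function name `farthest_first` is illustrative, not taken from the script):

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Pick a random start, then repeatedly add the vector whose minimum
    distance to the already-selected set is largest (farthest-first
    traversal), until k vectors are selected."""
    random.seed(seed)
    selected = [random.choice(vectors)]
    while len(selected) < k:
        best, best_dist = None, -1.0
        for v in vectors:
            if v in selected:
                continue
            # Distance of v to the selected set = min distance to any member.
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected
```

This greedy traversal favours vectors spread across the whole cluster, which is why the samples above mix very high- and very low-similarity weight vectors.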

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 0 matches and 73 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0
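
The purity and entropy figures reported after each oracle call follow directly from the match / non-match counts: purity is the majority-class fraction, and entropy is the base-2 Shannon entropy of the two class proportions. A minimal sketch (the function name is illustrative):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = fraction of the majority class; entropy = base-2 Shannon
    entropy of the match / non-match proportions."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                 # by convention 0 * log2(0) = 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For example, 0 matches and 73 non-matches give purity 1.000 and entropy 0.000, exactly as printed above, while the 31/45 split seen later in this log gives purity 0.592 and entropy 0.975.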

*** Warning: Oracle returns an empty match dictionary ***
Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing the file: diverg(15)100_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 100), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)100_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 395
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 395 weight vectors
  Containing 213 true matches and 182 true non-matches
    (53.92% true matches)
  Identified 358 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   340  (94.97%)
          2 :    15  (4.19%)
          3 :     2  (0.56%)
         19 :     1  (0.28%)

Identified 1 non-pure unique weight vector (from 358 unique weight vectors)
Pureness (as the proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 178
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 179

Removed 1 non-pure weight vector

Final number of weight vectors to use: 394
  Number of unique weight vectors: 358
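
The pre-processing step above — dropping the minority-class copies of any weight vector that occurs with both match statuses — can be sketched like this (a simplified stand-in; all names are illustrative):

```python
from collections import defaultdict

def remove_non_pure(vectors):
    """vectors: list of (weight_tuple, is_match) pairs. Group identical
    weight vectors; for any group that mixes matches and non-matches
    (pureness strictly between 0 and 1), drop the minority-class copies."""
    groups = defaultdict(list)
    for vec, is_match in vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        if 0.0 < pureness < 1.0:
            # Keep only the majority class (ties kept as matches here).
            majority = pureness >= 0.5
            kept.extend((vec, lab) for lab in labels if lab == majority)
        else:
            kept.extend((vec, lab) for lab in labels)
    return kept
```

A group of 19 matches plus 1 non-match has pureness 0.950, matching the "0.950 : 1" row above; only the single non-match copy is removed.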

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (358, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 358 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 358 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 31 matches and 45 non-matches
    Purity of oracle classification:  0.592
    Entropy of oracle classification: 0.975
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0
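
The oracle step simulates a human reviewer with a configurable accuracy (`oracle_acc` on the command line): each queried weight vector is labelled correctly with that probability and flipped otherwise. With 100.00% accuracy, as here, no labels are flipped. A minimal sketch of such a noisy oracle (the function name is illustrative):

```python
import random

def simulate_oracle(true_labels, accuracy, seed=1):
    """Return each true match status correctly with probability
    `accuracy`, flipped otherwise; accuracy=1.0 is a perfect oracle."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```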

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 282 weight vectors
  Based on 31 matches and 45 non-matches
  Classified 151 matches and 131 non-matches
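
The split step trains the chosen split classifier (an SVM in this run) on the oracle-labelled vectors and routes every remaining vector into a match or a non-match sub-cluster, which become the two new queue entries. The idea can be sketched with a plain perceptron instead of an SVM, to stay dependency-free (all names here are illustrative):

```python
import random

def train_linear_split(match_vecs, non_match_vecs, epochs=100, lr=0.1, seed=0):
    """Perceptron stand-in for the script's SVM split classifier: learn a
    linear boundary from the oracle-labelled training vectors."""
    random.seed(seed)
    w = [0.0] * len(match_vecs[0])
    b = 0.0
    data = [(v, 1) for v in match_vecs] + [(v, -1) for v in non_match_vecs]
    for _ in range(epochs):
        random.shuffle(data)
        for v, y in data:
            score = sum(wi * vi for wi, vi in zip(w, v)) + b
            if y * score <= 0:      # misclassified (or on the boundary)
                w = [wi + lr * y * vi for wi, vi in zip(w, v)]
                b += lr * y
    return w, b

def predict(w, b, v):
    """True -> route into the match sub-cluster, False -> non-match."""
    return sum(wi * vi for wi, vi in zip(w, v)) + b > 0
```

On linearly separable training data the perceptron converges to zero training error, after which every unlabelled weight vector is assigned to one of the two sub-clusters by `predict`.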

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.5921052631578947, 0.9753817903274212, 0.40789473684210525)
    (131, 0.5921052631578947, 0.9753817903274212, 0.40789473684210525)

Current size of match and non-match training data sets: 31 / 45

Selected cluster (queue ordering: random) with:
- Purity 0.59 and entropy 0.98
- Size 131 weight vectors
- Estimated match proportion 0.408

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 3 matches and 51 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.310
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing the file: diverg(20)467_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 467), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)467_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analyzing the file: diverg(15)56_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (15, 1 - acm diverg, 56), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)56_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 771
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 771 weight vectors
  Containing 203 true matches and 568 true non-matches
    (26.33% true matches)
  Identified 721 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   687  (95.28%)
          2 :    31  (4.30%)
          3 :     2  (0.28%)
         16 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 721 unique weight vectors)
Pureness (as the proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 173
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 547

Removed 1 non-pure weight vector

Final number of weight vectors to use: 770
  Number of unique weight vectors: 721

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (721, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 721 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 721 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
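
The purity, entropy, and match-proportion figures reported throughout this log can be reproduced from the oracle's match/non-match counts. A minimal sketch (the function name is illustrative, not taken from the original script):

```python
import math

def cluster_stats(num_match, num_non_match):
    """Return (purity, entropy, match proportion) of a cluster: purity is
    the majority-class fraction, entropy the binary Shannon entropy of
    the match/non-match distribution."""
    total = num_match + num_non_match
    p = num_match / total                 # estimated match proportion
    purity = max(p, 1.0 - p)              # majority-class fraction
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p

# 30 matches and 54 non-matches, as classified by the oracle above:
purity, entropy, match_prop = cluster_stats(30, 54)
print(f"{purity:.3f} {entropy:.3f} {match_prop:.3f}")  # 0.643 0.940 0.357
```

These values agree with the purity 0.643, entropy 0.940 and match proportion 0.357 reported for this 84-vector sample.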

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 637 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 140 matches and 497 non-matches
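
Splitting a non-pure cluster with an SVM trained on the oracle-labelled samples could look roughly like the sketch below, using scikit-learn's `SVC`. The kernel choice and all variable names are assumptions; the log does not show the classifier's configuration:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train an SVM on the oracle-labelled weight vectors, then split the
    remaining vectors of the cluster into predicted-match and
    predicted-non-match sub-clusters."""
    clf = SVC(kernel="linear")  # assumption: the log does not name the kernel
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(rest_vecs)
    return rest_vecs[pred == 1], rest_vecs[pred == 0]

# Toy data: clearly separable high- and low-similarity vectors
rng = np.random.default_rng(42)
train = np.vstack([rng.uniform(0.7, 1.0, (10, 7)),   # match-like
                   rng.uniform(0.0, 0.3, (10, 7))])  # non-match-like
labels = np.array([1] * 10 + [0] * 10)
rest = np.vstack([rng.uniform(0.7, 1.0, (5, 7)),
                  rng.uniform(0.0, 0.3, (5, 7))])
matches, non_matches = svm_split(train, labels, rest)
print(len(matches), len(non_matches))
```

The two returned arrays become the two child clusters pushed onto the queue, as in the "Loop 2: Queue length: 2" state that follows.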

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (497, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 140 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 140 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
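
A farthest-first selection like the one above can be sketched as a greedy k-center pick: each new vector is the one farthest (in Euclidean distance) from everything selected so far. The seeding rule used here (start from the vector farthest from the centroid) is an assumption, since the log does not show how the first vector is chosen:

```python
import numpy as np

def farthest_first(vectors, k):
    """Greedily select k row indices so that each new pick is the vector
    farthest from all previously selected vectors."""
    vectors = np.asarray(vectors, dtype=float)
    # Assumption: seed with the vector farthest from the centroid
    centroid = vectors.mean(axis=0)
    first = int(np.argmax(np.linalg.norm(vectors - centroid, axis=1)))
    selected = [first]
    # distance of every vector to its nearest selected vector
    min_dist = np.linalg.norm(vectors - vectors[first], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        d = np.linalg.norm(vectors - vectors[nxt], axis=1)
        min_dist = np.minimum(min_dist, d)
    return selected

pts = [[0, 0], [0, 1], [10, 0], [10, 1], [5, 5]]
print(farthest_first(pts, 3))  # [0, 3, 4]
```

The greedy rule spreads the sample across the cluster, which is why the selections above mix high- and low-similarity vectors rather than near-duplicates.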

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 47 matches and 7 non-matches
    Purity of oracle classification:  0.870
    Entropy of oracle classification: 0.556
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing the file: diverg(15)591_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979592
recall                  0.32107
f-measure              0.483627
da                           98
dm                            0
ndm                           0
tp                           96
fp                            2
tn                  4.76529e+07
fn                          203
Name: (15, 1 - acm diverg, 591), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)591_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 678
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 678 weight vectors
  Containing 167 true matches and 511 true non-matches
    (24.63% true matches)
  Identified 659 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   646  (98.03%)
          2 :    10  (1.52%)
          3 :     2  (0.30%)
          6 :     1  (0.15%)
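
The occurrence table above is a frequency-of-frequencies over the weight vectors: first count how often each distinct vector occurs, then count how many distinct vectors share each occurrence count. A minimal sketch with `collections.Counter` (assuming each weight vector is represented as a tuple of similarities):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then count how
    many distinct vectors share each occurrence count."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())

# One vector occurs once, one twice, and one three times:
vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
        (0.9, 0.9), (0.9, 0.9), (0.9, 0.9)]
print(sorted(occurrence_distribution(vecs).items()))
# [(1, 1), (2, 1), (3, 1)]
```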

Identified 0 non-pure unique weight vectors (from 659 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 150
     0.000 : 509

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 678
  Number of unique weight vectors: 659

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (659, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 659 weight vectors
- Estimated match proportion 0.500
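
The overall loop visible in this trace — pop a cluster from the queue (ordering: random), sample it, have the oracle label the sample, and re-queue splits of clusters that are not pure enough — can be sketched as follows. All names, the purity/size thresholds, and the stopping rules are assumptions inferred from this log:

```python
import random

def recursive_selection(clusters, budget, sample_and_label, split,
                        min_purity=0.95, max_cluster_size=50):
    """Pop clusters from the queue, sample and oracle-label each one, and
    re-queue splits of clusters that are not pure enough."""
    queue = list(clusters)
    labelled = 0
    training = []
    while queue and labelled < budget:
        cluster = random.choice(queue)          # queue ordering: random
        queue.remove(cluster)
        sample, labels = sample_and_label(cluster)
        labelled += len(sample)
        training.extend(zip(sample, labels))
        purity = max(labels.count(1), labels.count(0)) / len(labels)
        remainder = [v for v in cluster if v not in sample]
        if remainder and (purity < min_purity
                          or len(remainder) > max_cluster_size):
            # e.g. an SVM trained on `training` would split the remainder
            queue.extend(split(remainder, training))
    return training

# Toy run: label the first two items of each cluster; re-queue the
# remainder as a single cluster until the budget is exhausted
demo = recursive_selection(
    [[0.9, 0.1, 0.8, 0.2, 0.7]], budget=4,
    sample_and_label=lambda c: (c[:2], [1 if x > 0.5 else 0 for x in c[:2]]),
    split=lambda rest, training: [rest])
print(len(demo))  # 4 oracle classifications performed, budget reached
```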

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 659 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 32 matches and 52 non-matches
    Purity of oracle classification:  0.619
    Entropy of oracle classification: 0.959
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 575 weight vectors
  Based on 32 matches and 52 non-matches
  Classified 114 matches and 461 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (114, 0.6190476190476191, 0.9587118829771318, 0.38095238095238093)
    (461, 0.6190476190476191, 0.9587118829771318, 0.38095238095238093)

Current size of match and non-match training data sets: 32 / 52

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 461 weight vectors
- Estimated match proportion 0.381

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 461 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 3 matches and 73 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.240
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

98.0
Analysing the file: diverg(10)517_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                 0.976
recall                 0.408027
f-measure              0.575472
da                          125
dm                            0
ndm                           0
tp                          122
fp                            3
tn                  4.76529e+07
fn                          177
Name: (10, 1 - acm diverg, 517), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)517_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 710
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 710 weight vectors
  Containing 141 true matches and 569 true non-matches
    (19.86% true matches)
  Identified 676 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   647  (95.71%)
          2 :    26  (3.85%)
          3 :     2  (0.30%)
          5 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 676 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 127
     0.000 : 549

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 710
  Number of unique weight vectors: 676

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (676, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 676 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 676 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 25 matches and 59 non-matches
    Purity of oracle classification:  0.702
    Entropy of oracle classification: 0.878
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 592 weight vectors
  Based on 25 matches and 59 non-matches
  Classified 87 matches and 505 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (87, 0.7023809523809523, 0.8783609387702276, 0.2976190476190476)
    (505, 0.7023809523809523, 0.8783609387702276, 0.2976190476190476)

Current size of match and non-match training data sets: 25 / 59

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 87 weight vectors
- Estimated match proportion 0.298

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 87 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 41 matches and 1 non-match
    Purity of oracle classification:  0.976
    Entropy of oracle classification: 0.162
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

125.0
Analysing the file: diverg(15)642_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 642), dtype: object
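
The precision, recall and f-measure fields in the Series above are consistent with the standard definitions applied to its tp, fp and fn counts:

```python
# counts taken from the Series printed above
tp, fp, fn = 45, 1, 254

precision = tp / (tp + fp)                                  # 45/46  ~ 0.978261
recall = tp / (tp + fn)                                     # 45/299 ~ 0.150502
f_measure = 2 * precision * recall / (precision + recall)   # ~ 0.26087
```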

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)642_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 855
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 855 weight vectors
  Containing 221 true matches and 634 true non-matches
    (25.85% true matches)
  Identified 799 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   763  (95.49%)
          2 :    33  (4.13%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 799 unique weight vectors)
Pureness (as proportion of matches) for a given unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 613

Removed 1 non-pure weight vector
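
The non-pure filter above (one group of identical weight vectors contains both matches and non-matches, so its minority-class copies are dropped) can be sketched as follows; `remove_non_pure` is a hypothetical name, not the program's actual function:

```python
from collections import defaultdict

def remove_non_pure(vectors, labels):
    """Group identical weight vectors; for each group that mixes
    match and non-match labels, keep only the majority-class copies."""
    groups = defaultdict(list)
    for i, vec in enumerate(vectors):
        groups[tuple(vec)].append(i)
    keep = []
    for idxs in groups.values():
        pureness = sum(labels[i] for i in idxs) / len(idxs)
        if pureness in (0.0, 1.0):          # pure group: keep everything
            keep.extend(idxs)
        else:                               # mixed group: drop minority class
            majority = pureness >= 0.5
            keep.extend(i for i in idxs if labels[i] == majority)
    return sorted(keep)
```

For example, a group of 20 identical vectors holding 19 matches has pureness 0.95, so its single non-match copy is dropped — which accounts for the one vector removed above.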

Final number of weight vectors to use: 854
  Number of unique weight vectors: 799

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (799, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 799 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 799 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
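
The farthest-first ("far") selection used above is a greedy traversal: it repeatedly picks the vector with the largest minimum distance to the already-selected set. A minimal sketch, assuming Euclidean distance and a random starting vector (the original implementation may differ in both):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedily select k vectors: start from a random one, then
    repeatedly add the vector whose minimum distance to the
    selected set is largest."""
    X = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(X)))]
    # minimum distance of every vector to the selected set so far
    dmin = np.linalg.norm(X - X[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(dmin.argmax())
        selected.append(nxt)
        dmin = np.minimum(dmin, np.linalg.norm(X - X[nxt], axis=1))
    return selected
```

Because each new pick maximizes the distance to everything chosen so far, the sample spreads across the whole weight-vector space, which is why the listing above mixes very high and very low similarity vectors.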

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 714 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 150 matches and 564 non-matches
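
The split step — train an SVM on the oracle-labelled vectors, then divide the cluster's remaining vectors by predicted class — can be sketched with scikit-learn. The data below is a synthetic stand-in; the program's actual SVM library and parameters are not shown in this log:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)
# stand-ins for the 28 oracle-labelled matches and 57 non-matches
X_train = np.vstack([rng.uniform(0.7, 1.0, (28, 7)),
                     rng.uniform(0.0, 0.5, (57, 7))])
y_train = np.array([1] * 28 + [0] * 57)

clf = SVC(kernel="rbf").fit(X_train, y_train)

# classify the 714 weight vectors left in the cluster and split it in two
remaining = rng.uniform(0.0, 1.0, (714, 7))
pred = clf.predict(remaining)
match_cluster = remaining[pred == 1]
non_match_cluster = remaining[pred == 0]
```

The two sub-clusters are then pushed back onto the queue, as the Loop 2 header below shows.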

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (564, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
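
Both child clusters in the queue inherit the statistics of the parent's oracle-labelled sample: with 28 matches among the 85 sampled vectors, the estimated match proportion is 28/85 and the purity 57/85, which reproduces the queue entries above exactly:

```python
import math

matches, sample = 28, 85
prop = matches / sample                  # estimated match proportion
purity = max(prop, 1 - prop)             # majority-class fraction
entropy = -sum(q * math.log2(q) for q in (prop, 1 - prop))
print(prop, purity, entropy)
```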

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 564 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 564 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 5 matches and 69 non-matches
    Purity of oracle classification:  0.932
    Entropy of oracle classification: 0.357
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing the file: diverg(20)680_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 680), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)680_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as proportion of matches) for a given unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing the file: diverg(10)622_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 622), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)622_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 455
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 455 weight vectors
  Containing 219 true matches and 236 true non-matches
    (48.13% true matches)
  Identified 419 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   400  (95.47%)
          2 :    16  (3.82%)
          3 :     2  (0.48%)
         17 :     1  (0.24%)

Identified 1 non-pure unique weight vector (from 419 unique weight vectors)
Pureness (as proportion of matches) for a given unique weight vector:
  Pureness : Count
     1.000 : 185
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 233

Removed 1 non-pure weight vector

Final number of weight vectors to use: 454
  Number of unique weight vectors: 419

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (419, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 419 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 419 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 40 matches and 38 non-matches
    Purity of oracle classification:  0.513
    Entropy of oracle classification: 1.000
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0
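The purity and entropy figures reported for each oracle classification above follow the standard binary definitions (majority-class fraction and Shannon entropy in bits). A minimal sketch — the function name is illustrative, not taken from the original program:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity (majority-class fraction) and Shannon entropy (in bits)
    of a set split into matches and non-matches by the oracle."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 40 matches and 38 non-matches above this gives purity 0.513 and entropy 1.000 (rounded), matching the log.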

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 341 weight vectors
  Based on 40 matches and 38 non-matches
  Classified 282 matches and 59 non-matches
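The split step above trains a classifier on the oracle-labelled samples and then divides the remaining cluster by predicted class. A minimal sketch, assuming scikit-learn's `SVC` as the SVM (the original program's SVM settings are not shown in this log, so kernel choice here is an assumption):

```python
from sklearn.svm import SVC

def split_cluster(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-labelled weight vectors (1 = match,
    0 = non-match) and split the remaining cluster by predicted class."""
    clf = SVC(kernel="linear")  # kernel is an assumption, not from the log
    clf.fit(train_vecs, train_labels)
    predictions = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, predictions) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, predictions) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are what re-enter the queue in the next loop iteration.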

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (282, 0.5128205128205128, 0.9995256892936493, 0.5128205128205128)
    (59, 0.5128205128205128, 0.9995256892936493, 0.5128205128205128)

Current size of match and non-match training data sets: 40 / 38

Selected cluster with (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 282 weight vectors
- Estimated match proportion 0.513

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 282 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
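Farthest-first selection, as used above, greedily picks weight vectors that are maximally spread out. A minimal sketch of the idea (a k-center-style heuristic; the original program's distance measure, seeding and tie-breaking may differ):

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Select k vectors: start from a random one, then repeatedly add
    the vector whose minimum Euclidean distance to the already-selected
    set is largest."""
    rng = random.Random(seed)
    selected = [rng.choice(vectors)]
    while len(selected) < k:
        best, best_dist = None, -1.0
        for v in vectors:
            if v in selected:
                continue
            # distance to the closest already-selected vector
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected
```

This favours samples near the corners of the weight-vector space, which is why the selected lists above mix clearly match-like and clearly non-match-like vectors.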

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 44 matches and 28 non-matches
    Purity of oracle classification:  0.611
    Entropy of oracle classification: 0.964
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)575_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 575), dtype: object
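The precision, recall and f-measure rows in the summary above follow directly from the tp, fp and fn counts; a minimal sketch (the function name is illustrative):

```python
def precision_recall_f(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive
    and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

With tp=43, fp=0, fn=256 this reproduces the values above: precision 1, recall 0.143813, f-measure 0.251462.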

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)575_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 883
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 883 weight vectors
  Containing 212 true matches and 671 true non-matches
    (24.01% true matches)
  Identified 831 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   795  (95.67%)
          2 :    33  (3.97%)
          3 :     2  (0.24%)
         16 :     1  (0.12%)
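The frequency distribution above can be computed with two nested counts; a minimal sketch (assumes weight vectors are given as lists of floats):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then how many
    distinct vectors share each occurrence count."""
    per_vector = Counter(map(tuple, weight_vectors))
    return Counter(per_vector.values())
```

For the file above, 795 distinct vectors occur once, 33 twice, 2 three times and 1 sixteen times, which sums back to the 883 loaded vectors (795 + 66 + 6 + 16).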

Identified 1 non-pure unique weight vector (from 831 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 650

Removed 1 non-pure weight vector

Final number of weight vectors to use: 882
  Number of unique weight vectors: 831

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (831, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 831 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 831 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 745 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 163 matches and 582 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (163, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (582, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 582 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 582 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 0 matches and 75 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(10)574_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984615
recall                 0.214047
f-measure              0.351648
da                           65
dm                            0
ndm                           0
tp                           64
fp                            1
tn                  4.76529e+07
fn                          235
Name: (10, 1 - acm diverg, 574), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)574_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 860
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 860 weight vectors
  Containing 191 true matches and 669 true non-matches
    (22.21% true matches)
  Identified 813 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   780  (95.94%)
          2 :    30  (3.69%)
          3 :     2  (0.25%)
         14 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 813 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 164
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 648

Removed 1 non-pure weight vector

Final number of weight vectors to use: 859
  Number of unique weight vectors: 813

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (813, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 813 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 813 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 727 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 146 matches and 581 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (581, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 146 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
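The farthest-first selections shown above greedily pick, at each step, the weight vector whose distance to its nearest already-selected vector is largest, so the sample spreads out across the cluster. A minimal sketch assuming Euclidean distance and an arbitrary first seed (the program's actual distance measure and seeding rule are not shown in the log):

```python
# Sketch: greedy farthest-first traversal over a list of weight
# vectors (tuples of floats). Assumes Euclidean distance (math.dist,
# Python 3.8+) and seeds with the first vector.
import math

def farthest_first(vectors, k):
    selected = [vectors[0]]                # arbitrary seed
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # distance from a candidate to its closest selected vector
        def min_dist(v):
            return min(math.dist(v, s) for s in selected)
        far = max(remaining, key=min_dist)  # farthest such candidate
        selected.append(far)
        remaining.remove(far)
    return selected
```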

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 45 matches and 9 non-matches
    Purity of oracle classification:  0.833
    Entropy of oracle classification: 0.650
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

65.0
Analysing file: diverg(15)724_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 724), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)724_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 812
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 812 weight vectors
  Containing 226 true matches and 586 true non-matches
    (27.83% true matches)
  Identified 755 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   718  (95.10%)
          2 :    34  (4.50%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
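The frequency distribution above counts how often each identical weight vector occurs. A sketch of the same tally using `collections.Counter` (function name is illustrative):

```python
# Sketch: occurrence distribution of weight vectors. First count how
# often each unique vector occurs, then count how many vectors share
# each occurrence count.
from collections import Counter

def occurrence_distribution(vectors):
    vec_counts = Counter(map(tuple, vectors))  # occurrences per unique vector
    return Counter(vec_counts.values())        # occurrence -> number of vectors
```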

Identified 1 non-pure unique weight vector (from 755 unique weight vectors)
Pureness (as the fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 565

Removed 1 non-pure weight vector
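The non-pure weight-vector removal reported above can be sketched as: group identical weight vectors, compute the fraction of matches ("pureness") in each group, and drop the minority-class copies of any group that is neither all-match nor all-non-match. This is an assumed reconstruction; names are illustrative.

```python
# Sketch: remove minority-class copies of non-pure unique weight
# vectors. Input: list of (tuple_of_weights, is_match) pairs.
from collections import defaultdict

def remove_non_pure(weight_vectors):
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)   # fraction of matches
        if pureness in (0.0, 1.0):             # pure: keep all copies
            kept.extend((vec, l) for l in labels)
        else:                                  # non-pure: keep majority class
            majority = pureness > 0.5
            kept.extend((vec, l) for l in labels if l == majority)
    return kept
```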

Final number of weight vectors to use: 811
  Number of unique weight vectors: 755

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (755, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 755 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 755 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 670 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 165 matches and 505 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (165, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (505, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 165 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 165 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 47 matches and 10 non-matches
    Purity of oracle classification:  0.825
    Entropy of oracle classification: 0.670
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  10
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)790_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 790), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)790_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 696
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 696 weight vectors
  Containing 208 true matches and 488 true non-matches
    (29.89% true matches)
  Identified 660 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   644  (97.58%)
          2 :    13  (1.97%)
          3 :     2  (0.30%)
         20 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 660 unique weight vectors)
Pureness (as the fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 172
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 487

Removed 1 non-pure weight vector

Final number of weight vectors to use: 695
  Number of unique weight vectors: 660

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (660, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 660 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 660 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.667, 0.571, 0.500, 0.625] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 34 matches and 50 non-matches
    Purity of oracle classification:  0.595
    Entropy of oracle classification: 0.974
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 576 weight vectors
  Based on 34 matches and 50 non-matches
  Classified 304 matches and 272 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (304, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)
    (272, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)

Current size of match and non-match training data sets: 34 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.60 and entropy 0.97
- Size 272 weight vectors
- Estimated match proportion 0.405

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 272 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.667, 0.857, 0.353, 0.632, 0.550] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 0.000, 0.864, 0.667, 0.435, 0.700, 0.600] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.462, 0.609, 0.643, 0.706, 0.786] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.950, 0.000, 0.619, 0.800, 0.478, 0.280, 0.625] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 0 matches and 69 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)948_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 948), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)948_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 984
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 984 weight vectors
  Containing 211 true matches and 773 true non-matches
    (21.44% true matches)
  Identified 932 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   897  (96.24%)
          2 :    32  (3.43%)
          3 :     2  (0.21%)
         17 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 932 unique weight vectors)
Pureness (as the fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 179
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 752

Removed 1 non-pure weight vector

Final number of weight vectors to use: 983
  Number of unique weight vectors: 932

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (932, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 932 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 932 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
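
The farthest-first traversal used for the selections above greedily adds, at each step, the vector whose minimum Euclidean distance to the already-selected vectors is largest. A minimal sketch (seeding with the first vector is an assumption; the original may choose its seed differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly select the vector
    with the largest minimum distance to the vectors chosen so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed with the first vector (assumption)
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

This spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches and clear non-matches.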

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
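
The purity and entropy figures reported for each oracle-classified sample follow the standard two-class definitions: purity is the majority-class fraction, and entropy is the Shannon entropy (in bits) of the match/non-match split. For the 31 matches and 56 non-matches above:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Two-class purity (majority fraction) and Shannon entropy in bits."""
    total = num_match + num_non_match
    p_match = num_match / total
    purity = max(p_match, 1.0 - p_match)
    entropy = -sum(q * math.log2(q) for q in (p_match, 1.0 - p_match) if q > 0)
    return purity, entropy

purity, entropy = purity_entropy(31, 56)
# purity = 56/87 ≈ 0.644 and entropy ≈ 0.940, matching the log above
```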

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 845 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 292 matches and 553 non-matches
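
The SVM step trains on the oracle-labelled sample and partitions the remaining vectors by predicted class, producing the two clusters queued in the next loop. A minimal sketch with scikit-learn (`svm_split` is an illustrative name; the kernel and parameters are assumptions, since this log does not show the classifier's configuration):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on the oracle-labelled vectors, then split the
    remaining unlabelled vectors by predicted match status (1/0)."""
    clf = SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if p == 0]
    return matches, non_matches
```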

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (292, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (553, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 292 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 292 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.600, 1.000, 0.217, 0.132, 0.167, 0.125, 0.188] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 42 matches and 26 non-matches
    Purity of oracle classification:  0.618
    Entropy of oracle classification: 0.960
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(20)742_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 742), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)742_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)874_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.983871
recall                 0.204013
f-measure               0.33795
da                           62
dm                            0
ndm                           0
tp                           61
fp                            1
tn                  4.76529e+07
fn                          238
Name: (10, 1 - acm diverg, 874), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)874_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 661
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 661 weight vectors
  Containing 197 true matches and 464 true non-matches
    (29.80% true matches)
  Identified 611 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   577  (94.44%)
          2 :    31  (5.07%)
          3 :     2  (0.33%)
         16 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 611 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 167
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 443

Removed 1 non-pure weight vector

Final number of weight vectors to use: 660
  Number of unique weight vectors: 611

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (611, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 611 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 611 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 29 matches and 54 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.934
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
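
The purity and entropy figures above follow directly from the match/non-match counts: purity is the majority-class fraction, entropy the base-2 Shannon entropy of the class distribution. A minimal sketch (the function name is illustrative, not from the original script), checked against the counts above:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = Shannon entropy
    (base 2) of the match/non-match distribution."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# Reproduce the oracle output above: 29 matches, 54 non-matches
purity, entropy = purity_and_entropy(29, 54)
print(round(purity, 3), round(entropy, 3))  # 0.651 0.934
</imports>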

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 528 weight vectors
  Based on 29 matches and 54 non-matches
  Classified 148 matches and 380 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)
    (380, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)

Current size of match and non-match training data sets: 29 / 54

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 380 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 380 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.462, 0.609, 0.684, 0.308, 0.545] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)

Perform oracle with 100.00 accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 2 matches and 69 non-matches
    Purity of oracle classification:  0.972
    Entropy of oracle classification: 0.185
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

62.0
Analisando o arquivo: diverg(20)86_NEW.csv
<class 'pandas.core.series.Series'>
Linha atual aqui, jovem!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 86), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)86_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vectors (from 916 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00 accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00 accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analisando o arquivo: diverg(15)797_NEW.csv
<class 'pandas.core.series.Series'>
Linha atual aqui, jovem!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 797), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)797_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 586
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 586 weight vectors
  Containing 196 true matches and 390 true non-matches
    (33.45% true matches)
  Identified 562 unique weight vectors
  Frequency distribution of occurences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :   549  (97.69%)
          2 :    10  (1.78%)
          3 :     2  (0.36%)
         11 :     1  (0.18%)

Identified 1 non-pure unique weight vectors (from 562 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 389

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 585
  Number of unique weight vectors: 562

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (562, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 562 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 562 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00 accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 480 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 136 matches and 344 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (344, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 136 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
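The farthest first selection logged above can be sketched as a greedy farthest-first traversal over the weight vectors: start from one vector, then repeatedly add the candidate whose minimum distance to the already selected set is largest. This is an illustrative reimplementation; the function name, seeding, and use of Euclidean distance are assumptions, not necessarily the script's exact choices.

```python
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first selection of k vectors (illustrative sketch)."""
    rng = random.Random(seed)
    remaining = list(vectors)
    # Start from a randomly chosen vector
    selected = [remaining.pop(rng.randrange(len(remaining)))]

    def dist(a, b):
        # Euclidean distance between two weight vectors
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    while remaining and len(selected) < k:
        # Pick the candidate whose nearest selected vector is farthest away
        far_idx = max(range(len(remaining)),
                      key=lambda i: min(dist(remaining[i], s) for s in selected))
        selected.append(remaining.pop(far_idx))
    return selected
```

Because each new pick maximises the distance to everything chosen so far, the sample spreads across the whole cluster rather than concentrating in one dense region, which is why the selected vectors above mix clear matches with borderline cases.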

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0
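The purity and entropy figures reported here (and for each cluster in the queue) are the majority-class fraction and the binary entropy of the match proportion. A minimal sketch, with a helper name of my own choosing:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity and binary entropy of a cluster from its match / non-match counts."""
    total = num_matches + num_non_matches
    p = num_matches / total           # estimated match proportion
    purity = max(p, 1.0 - p)          # majority-class fraction
    # Binary entropy; skip zero-probability terms to avoid log(0)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 49 matches and 3 non-matches classified above this gives purity 0.942 and entropy 0.318, matching the log; applied to the 27 / 55 training counts it reproduces the 0.6707 / 0.9142 queue statistics.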

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing file: diverg(15)600_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 600), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)600_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 537
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 537 weight vectors
  Containing 224 true matches and 313 true non-matches
    (41.71% true matches)
  Identified 498 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   479  (96.18%)
          2 :    16  (3.21%)
          3 :     2  (0.40%)
         20 :     1  (0.20%)
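The frequency distribution above is a two-level count: first how often each exact weight vector occurs, then how many vectors share each occurrence count. With `collections.Counter` this can be sketched as (illustrative, not the script's own code):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of weight vectors with that count."""
    per_vector = Counter(map(tuple, weight_vectors))   # occurrences per unique vector
    return Counter(per_vector.values())                # occurrence -> vector count
```

The single vector occurring 20 times in this run corresponds to one highly repeated similarity pattern among the record pairs.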

Identified 1 non-pure unique weight vector (from 498 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 310

Removed 1 non-pure weight vector

Final number of weight vectors to use: 536
  Number of unique weight vectors: 498

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (498, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 498 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 498 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 33 matches and 47 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.978
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0
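The oracle step simulates a human reviewer of a given accuracy; with oracle_acc at 100% every label is returned unchanged, so all false matches and false non-matches are zero. A hedged sketch of such a simulation (per-label flipping and seeding are my assumptions about the mechanism, not the script's exact code):

```python
import random

def noisy_oracle(true_match_labels, accuracy, seed=0):
    """Return oracle labels: each true label is kept with probability `accuracy`,
    otherwise flipped (simulating an imperfect human classifier)."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_match_labels]
```

Lowering `accuracy` below 1.0 would introduce false matches and false non-matches into the training sets, which is what the sample_error / oracle_acc parameters of the script are designed to study.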

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 418 weight vectors
  Based on 33 matches and 47 non-matches
  Classified 151 matches and 267 non-matches
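After the oracle-labelled sample is deleted, the script trains an SVM on that sample and splits the remaining weight vectors of the cluster into a predicted-match child and a predicted-non-match child (the two queue entries of the next loop). Below is a sketch of that splitting step; a nearest-centroid rule stands in for the SVM so the example stays dependency-free, since the structure of the step (train on the labelled sample, partition the rest) is the point.

```python
def split_cluster(labeled_vecs, labels, unlabeled_vecs):
    """Partition unlabeled weight vectors into match / non-match children
    using a nearest-centroid classifier trained on the oracle sample
    (stand-in for the script's SVM)."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    match_cen = centroid([v for v, l in zip(labeled_vecs, labels) if l])
    non_cen = centroid([v for v, l in zip(labeled_vecs, labels) if not l])

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches, non_matches = [], []
    for v in unlabeled_vecs:
        # Assign each vector to the closer class centroid
        (matches if sq_dist(v, match_cen) < sq_dist(v, non_cen)
         else non_matches).append(v)
    return matches, non_matches
```

In the run above this split sends 151 vectors to the predicted-match child and 267 to the predicted-non-match child; both inherit the purity and entropy estimated from the current training data.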

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.5875, 0.9777945702913884, 0.4125)
    (267, 0.5875, 0.9777945702913884, 0.4125)

Current size of match and non-match training data sets: 33 / 47

Selected cluster (queue ordering: random) with:
- Purity 0.59 and entropy 0.98
- Size 151 weight vectors
- Estimated match proportion 0.412

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.909, 1.000, 1.000, 1.000, 0.947] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 51 matches and 7 non-matches
    Purity of oracle classification:  0.879
    Entropy of oracle classification: 0.531
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)49_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 49), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)49_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 768
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 768 weight vectors
  Containing 216 true matches and 552 true non-matches
    (28.12% true matches)
  Identified 730 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   712  (97.53%)
          2 :    15  (2.05%)
          3 :     2  (0.27%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 730 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 549

Removed 1 non-pure weight vector

Final number of weight vectors to use: 767
  Number of unique weight vectors: 730

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (730, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 730 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 730 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 36 matches and 49 non-matches
    Purity of oracle classification:  0.576
    Entropy of oracle classification: 0.983
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 645 weight vectors
  Based on 36 matches and 49 non-matches
  Classified 276 matches and 369 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (276, 0.5764705882352941, 0.9830605548016025, 0.4235294117647059)
    (369, 0.5764705882352941, 0.9830605548016025, 0.4235294117647059)

Current size of match and non-match training data sets: 36 / 49

Selected cluster (queue ordering: random) with:
- Purity 0.58 and entropy 0.98
- Size 276 weight vectors
- Estimated match proportion 0.424

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 276 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.146, 0.130, 0.176, 0.318, 0.167] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.261, 0.174, 0.148, 0.186, 0.148] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.242, 0.121, 0.200, 0.171, 0.000] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 44 matches and 26 non-matches
    Purity of oracle classification:  0.629
    Entropy of oracle classification: 0.952
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing file: diverg(10)440_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 440), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)440_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 626
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 626 weight vectors
  Containing 188 true matches and 438 true non-matches
    (30.03% true matches)
  Identified 605 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   591  (97.69%)
          2 :    11  (1.82%)
          3 :     2  (0.33%)
          7 :     1  (0.17%)

Identified 0 non-pure unique weight vectors (from 605 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 167
     0.000 : 438

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 626
  Number of unique weight vectors: 605

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (605, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 605 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 605 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
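The farthest-first traversal used for this selection can be sketched in plain Python (assuming Euclidean distance and seeding with the first vector; the original script's seed choice and metric may differ):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal: seed with the first
    vector, then repeatedly add the remaining vector whose minimum
    Euclidean distance to the already-selected set is largest."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

This greedy rule spreads the sample across the corners of the weight-vector space, which is why the selected vectors above mix very high and very low similarity values.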

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 32 matches and 51 non-matches
    Purity of oracle classification:  0.614
    Entropy of oracle classification: 0.962
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0
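The purity and entropy figures reported for the oracle sample follow the usual binary definitions; a sketch (illustrative names, not the original code):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity is the fraction of the majority class in the sample;
    entropy is the binary (Shannon) entropy of the match proportion,
    in bits."""
    total = num_matches + num_non_matches
    purity = max(num_matches, num_non_matches) / total
    p = num_matches / total
    if p in (0.0, 1.0):
        entropy = 0.0
    else:
        entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return purity, entropy
```

With the 32 matches and 51 non-matches above this gives purity 0.614 and entropy 0.962, matching the log.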

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 522 weight vectors
  Based on 32 matches and 51 non-matches
  Classified 273 matches and 249 non-matches
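The cluster split performed here can be sketched with a linear SVM, assuming scikit-learn is available (the original script uses its own SVM wrapper; names here are illustrative):

```python
from sklearn.svm import SVC

def split_cluster(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-classified sample (labels 1 = match,
    0 = non-match), then split the remaining cluster into predicted
    matches and predicted non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    predictions = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, predictions) if p]
    non_matches = [v for v, p in zip(cluster_vecs, predictions) if not p]
    return matches, non_matches
```

Both halves then go back onto the queue, each inheriting the purity, entropy, and estimated match proportion computed from the oracle sample, as the Loop 2 queue listing below shows.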

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (273, 0.6144578313253012, 0.9618624139909456, 0.3855421686746988)
    (249, 0.6144578313253012, 0.9618624139909456, 0.3855421686746988)

Current size of match and non-match training data sets: 32 / 51

Selected cluster with (queue ordering: random):
- Purity 0.61 and entropy 0.96
- Size 249 weight vectors
- Estimated match proportion 0.386

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 249 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.846, 0.684, 0.529, 0.727, 0.700] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.579, 0.583, 0.522, 0.417, 0.563] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.923, 0.667, 0.667, 0.412, 0.571] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.714, 0.500, 0.412, 0.762] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.750, 0.905, 0.667, 0.500, 0.571] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.533, 0.667, 0.333, 0.714, 0.632] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.433, 0.737, 0.706, 0.500, 0.800] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 0 matches and 67 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analysing the file: diverg(10)200_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 200), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)200_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 659
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 659 weight vectors
  Containing 211 true matches and 448 true non-matches
    (32.02% true matches)
  Identified 607 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   571  (94.07%)
          2 :    33  (5.44%)
          3 :     2  (0.33%)
         16 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 607 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 427

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 658
  Number of unique weight vectors: 607

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (607, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 607 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 607 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 524 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 179 matches and 345 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (345, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 179 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 179 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.947, 1.000, 0.292, 0.178, 0.227, 0.122, 0.154] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 44 matches and 14 non-matches
    Purity of oracle classification:  0.759
    Entropy of oracle classification: 0.797
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  14
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(20)802_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 802), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)802_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0
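
The purity, entropy, and estimated match proportion printed for each cluster follow directly from the oracle's match / non-match counts. A minimal sketch reconstructed from the reported numbers (function names are mine, not the program's):

```python
import math

def cluster_purity(num_matches, num_non_matches):
    """Fraction of the cluster belonging to its majority class."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    """Binary Shannon entropy (bits) of the match / non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Counts reported by the oracle above: 23 matches, 64 non-matches
print(round(cluster_purity(23, 64), 3))   # 0.736
print(round(cluster_entropy(23, 64), 3))  # 0.833
print(round(23 / (23 + 64), 3))           # 0.264, the estimated match proportion
```

These are the same three figures that reappear in the queue tuples of the next loop.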

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches
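
The SVM step trains on the oracle-labelled weight vectors and partitions the remaining unlabelled ones into two child clusters. A sketch assuming scikit-learn's `SVC`; the program's actual SVM binding, kernel, and parameters may differ, and `svm_split` is a hypothetical helper:

```python
# Hypothetical sketch of the cluster-splitting step using scikit-learn.
from sklearn.svm import SVC

def svm_split(train_matches, train_non_matches, remaining):
    """Train an SVM on the oracle-labelled weight vectors, then split the
    remaining unlabelled vectors into predicted match / non-match clusters."""
    X = train_matches + train_non_matches
    y = [1] * len(train_matches) + [0] * len(train_non_matches)
    clf = SVC(kernel="linear").fit(X, y)
    pred = clf.predict(remaining)
    match_cluster = [v for v, p in zip(remaining, pred) if p == 1]
    non_match_cluster = [v for v, p in zip(remaining, pred) if p == 0]
    return match_cluster, non_match_cluster

# Toy usage with well-separated 2-D vectors
m, n = svm_split([[0.9, 0.9], [1.0, 0.8]], [[0.1, 0.2], [0.2, 0.1]],
                 [[0.95, 0.95], [0.15, 0.1]])
print(len(m), len(n))  # 1 1
```

Both child clusters then go onto the queue for further refinement, as the Loop 2 listing shows.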

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
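
The "farthest first" selection used above is a greedy max-min traversal: each new vector is the one whose minimum distance to the already-selected set is largest. A stdlib-only sketch under that assumption (the function name and the choice of starting vector are mine, not necessarily the program's):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal over weight vectors.

    Repeatedly selects the vector whose minimum Euclidean distance
    to the vectors selected so far is largest.
    """
    selected = [vectors[start]]
    # min_dist[j]: distance from vectors[j] to its nearest selected vector
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected

# Toy usage: from four 1-D points, the second pick is the farthest one
print(farthest_first([(0.0,), (1.0,), (10.0,), (5.0,)], 2))
# [(0.0,), (10.0,)]
```

This spreads the sample across the cluster's extremes, which is why the listings above mix clearly matching and clearly non-matching vectors.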

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)495_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 495), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)495_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 801
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 801 weight vectors
  Containing 220 true matches and 581 true non-matches
    (27.47% true matches)
  Identified 763 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   745  (97.64%)
          2 :    15  (1.97%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
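
The histogram above is a two-level count: first how often each distinct weight vector appears, then how many vectors share each occurrence count. A small stdlib sketch (`occurrence_distribution` is a hypothetical helper, not the program's own function):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors occurring that often (as printed in the log above)."""
    per_vector = Counter(map(tuple, weight_vectors))
    return Counter(per_vector.values())

# Toy usage: one singleton, one pair, one triple
vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
        (0.9, 0.9), (0.9, 0.9), (0.9, 0.9)]
print(sorted(occurrence_distribution(vecs).items()))
# [(1, 1), (2, 1), (3, 1)]
```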

Identified 1 non-pure unique weight vector (from 763 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 578

Removed 1 non-pure weight vector

Final number of weight vectors to use: 800
  Number of unique weight vectors: 763

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (763, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 763 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 763 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 678 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 135 matches and 543 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 135 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 135 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(10)881_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990099
recall                 0.334448
f-measure                   0.5
da                          101
dm                            0
ndm                           0
tp                          100
fp                            1
tn                  4.76529e+07
fn                          199
Name: (10, 1 - acm diverg, 881), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)881_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 983
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 983 weight vectors
  Containing 164 true matches and 819 true non-matches
    (16.68% true matches)
  Identified 944 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   915  (96.93%)
          2 :    26  (2.75%)
          3 :     2  (0.21%)
         10 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 944 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 145
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 798

Removed 1 non-pure weight vector

Final number of weight vectors to use: 982
  Number of unique weight vectors: 944

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (944, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 944 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 944 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 32 matches and 55 non-matches
    Purity of oracle classification:  0.632
    Entropy of oracle classification: 0.949
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 857 weight vectors
  Based on 32 matches and 55 non-matches
  Classified 286 matches and 571 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (286, 0.632183908045977, 0.9489804585630242, 0.367816091954023)
    (571, 0.632183908045977, 0.9489804585630242, 0.367816091954023)

Current size of match and non-match training data sets: 32 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 571 weight vectors
- Estimated match proportion 0.368

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 571 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.538, 0.789, 0.353, 0.545, 0.550] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.444, 0.643, 0.421, 0.200, 0.556] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.350, 0.455, 0.625, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.667, 0.444, 0.556, 0.222, 0.143] (False)
    [1.000, 0.000, 0.583, 0.389, 0.471, 0.545, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

101.0
Analysing file: diverg(10)369_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 369), dtype: object
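
The precision, recall, and f-measure values in the Series above follow directly from the confusion counts it also lists (tp, fp, fn). A minimal sketch, using the counts printed above:

```python
# Recompute the quality metrics from the confusion counts shown in the
# Series above (tp=57, fp=0, fn=242)
tp, fp, fn = 57, 0, 242

precision = tp / (tp + fp)                              # 1.0
recall = tp / (tp + fn)                                 # ~0.190635
f_measure = 2 * precision * recall / (precision + recall)  # ~0.320225
```

These reproduce the logged values: precision 1, recall 0.190635, f-measure 0.320225.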

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)369_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 797
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 797 weight vectors
  Containing 207 true matches and 590 true non-matches
    (25.97% true matches)
  Identified 750 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   715  (95.33%)
          2 :    32  (4.27%)
          3 :     2  (0.27%)
         12 :     1  (0.13%)
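
The unique-vector count and the frequency distribution above can be obtained with two nested `Counter` passes. A minimal sketch on hypothetical toy vectors (the real data has one 7-dimensional vector per compared record pair):

```python
from collections import Counter

# hypothetical weight vectors, as tuples so they are hashable
vectors = [(1.0, 0.5), (1.0, 0.5), (0.3, 0.2), (1.0, 0.5), (0.8, 0.9)]

counts = Counter(vectors)               # occurrences of each unique vector
n_unique = len(counts)                  # "Identified N unique weight vectors"
freq_dist = Counter(counts.values())    # occurrence -> number of unique vectors
```

Here `n_unique` is 3 and `freq_dist` maps occurrence 1 to 2 unique vectors and occurrence 3 to 1, mirroring the table format above.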

Identified 1 non-pure unique weight vector (from 750 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 569

Removed 1 non-pure weight vector
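
The removal step above drops the minority-class copies of any unique vector that occurs with mixed true-match labels. This is a minimal stand-alone sketch of that filtering (the toy data is an assumption, chosen so one vector has pureness 11/12 ≈ 0.917, as in the table above):

```python
from collections import defaultdict

# hypothetical (vector, true_match) pairs; one vector occurs 12 times with
# mixed labels (11 matches, 1 non-match), one occurs 5 times as non-match
data = ([((0.9, 0.9), True)] * 11 + [((0.9, 0.9), False)]
        + [((0.1, 0.1), False)] * 5)

groups = defaultdict(list)
for vec, is_match in data:
    groups[vec].append(is_match)

cleaned = []
for vec, labels in groups.items():
    pureness = sum(labels) / len(labels)      # fraction of match labels
    if 0.0 < pureness < 1.0:
        # non-pure vector: keep only the majority-class copies
        majority = pureness >= 0.5
        cleaned += [(vec, majority)] * labels.count(majority)
    else:
        cleaned += [(vec, lab) for lab in labels]

removed = len(data) - len(cleaned)            # 1 minority-class copy dropped
```

With this data exactly one weight vector is removed, matching the "Removed 1 non-pure weight vector" line above.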

Final number of weight vectors to use: 796
  Number of unique weight vectors: 750

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (750, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 750 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 750 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
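
The "farthest first" selection above is a greedy k-center traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal sketch, not the program's actual implementation (the fixed starting point and Euclidean distance are assumptions):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first (k-center) traversal: start from the first
    vector, then repeatedly add the remaining vector whose minimum
    Euclidean distance to the selected set is largest."""
    remaining = list(vectors)
    selected = [remaining.pop(0)]        # starting point is an assumption
    while remaining and len(selected) < k:
        i = max(range(len(remaining)),
                key=lambda j: min(math.dist(remaining[j], s) for s in selected))
        selected.append(remaining.pop(i))
    return selected

# toy 2-d example (the vectors in this log are 7-dimensional)
picked = farthest_first([(0.0, 0.0), (4.0, 0.0), (9.0, 9.0), (10.0, 10.0)], 3)
```

Starting from (0, 0), the traversal first picks the far corner (10, 10), then (4, 0), since (9, 9) lies close to an already-selected point; this spreading behaviour is why the selected vectors above cover both clear matches and clear non-matches.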

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 24 matches and 61 non-matches
    Purity of oracle classification:  0.718
    Entropy of oracle classification: 0.859
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
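
The purity and entropy values reported above are the majority-class fraction and the binary Shannon entropy of the oracle's match/non-match split. A minimal sketch using the counts from this step (24 matches, 61 non-matches):

```python
import math

def purity_entropy(n_match, n_nonmatch):
    """Purity = majority-class fraction; entropy = base-2 Shannon entropy
    of the match/non-match proportions."""
    total = n_match + n_nonmatch
    p = n_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(24, 61)   # counts from the oracle step above
```

This reproduces the logged values 0.718 and 0.859 (and the exact queue figures 0.7176… and 0.8586… printed in the next loop); a pure cluster, as in the earlier all-non-match step, gives purity 1.0 and entropy 0.0.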

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 665 weight vectors
  Based on 24 matches and 61 non-matches
  Classified 103 matches and 562 non-matches
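
The split step above trains an SVM on the oracle-labelled sample (24 matches, 61 non-matches) and uses it to partition the 665 unclassified vectors into two child clusters. A minimal sketch using scikit-learn's `SVC`; the synthetic training and test data and the linear kernel are assumptions, not the program's actual configuration:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# hypothetical oracle-labelled sample: matches have high similarities,
# non-matches low (7 features, as in this log)
X_match = rng.uniform(0.7, 1.0, size=(24, 7))
X_nonmatch = rng.uniform(0.0, 0.5, size=(61, 7))
X_train = np.vstack([X_match, X_nonmatch])
y_train = np.array([1] * 24 + [0] * 61)

clf = SVC(kernel="linear")         # kernel choice is an assumption
clf.fit(X_train, y_train)

# classify the remaining (hypothetical) cluster vectors into two children
X_rest = rng.uniform(0.0, 1.0, size=(665, 7))
pred = clf.predict(X_rest)
n_match = int(pred.sum())
n_nonmatch = len(pred) - n_match
```

The two predicted groups then re-enter the cluster queue, as seen in Loop 2 below, where the queue holds the two child clusters produced by this split.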

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7176470588235294, 0.8586370819183629, 0.2823529411764706)
    (562, 0.7176470588235294, 0.8586370819183629, 0.2823529411764706)

Current size of match and non-match training data sets: 24 / 61

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 103 weight vectors
- Estimated match proportion 0.282

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 103 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(20)984_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 984), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)984_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 146 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (538, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 538 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 538 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 9 matches and 65 non-matches
    Purity of oracle classification:  0.878
    Entropy of oracle classification: 0.534
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)694_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.980583
recall                 0.337793
f-measure              0.502488
da                          103
dm                            0
ndm                           0
tp                          101
fp                            2
tn                  4.76529e+07
fn                          198
Name: (10, 1 - acm diverg, 694), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)694_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 453
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 453 weight vectors
  Containing 146 true matches and 307 true non-matches
    (32.23% true matches)
  Identified 441 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   433  (98.19%)
          2 :     5  (1.13%)
          3 :     2  (0.45%)
          4 :     1  (0.23%)

Identified 0 non-pure unique weight vectors (from 441 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 134
     0.000 : 307

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 453
  Number of unique weight vectors: 441

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (441, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 441 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 441 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 28 matches and 51 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.938
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0
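The purity and entropy values reported by the oracle follow directly from the match / non-match counts: purity is the majority-class fraction and entropy is the binary Shannon entropy of the class split, with the match fraction doubling as the estimated match proportion shown in the queue listings. A minimal sketch of these statistics (the function name is illustrative, not taken from the original program):

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity, binary entropy, and match proportion of a labelled cluster."""
    total = num_match + num_non_match
    p = num_match / total                      # estimated match proportion
    purity = max(num_match, num_non_match) / total
    # Binary (Shannon) entropy of the match / non-match split, in bits
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy, p

purity, entropy, prop = cluster_stats(28, 51)
print(round(purity, 3), round(entropy, 3), round(prop, 3))  # 0.646 0.938 0.354
```

With the 28/51 split classified above this reproduces the purity 0.646, entropy 0.938, and match proportion 0.354 reported in the log.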

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 362 weight vectors
  Based on 28 matches and 51 non-matches
  Classified 108 matches and 254 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (108, 0.6455696202531646, 0.9379626436434423, 0.35443037974683544)
    (254, 0.6455696202531646, 0.9379626436434423, 0.35443037974683544)

Current size of match and non-match training data sets: 28 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 108 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 108 vectors
  The selected farthest weight vectors are:
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
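The "Farthest first selection" steps throughout this log correspond to the classical farthest-first traversal: start from one vector and repeatedly add the vector whose distance to its nearest already-selected vector is largest. A hedged sketch, assuming Euclidean distance and a fixed starting index (the original program's distance measure and seeding may differ):

```python
def farthest_first(vectors, k, start=0):
    """Select k vectors by greedy farthest-first traversal."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [start]
    # min_d[i] = distance from vector i to its nearest selected vector
    min_d = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        nxt = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            d = dist(v, vectors[nxt])
            if d < min_d[i]:
                min_d[i] = d
    return [vectors[i] for i in selected]
```

Because each new pick maximises the minimum distance to the current sample, the selected vectors spread across the cluster, which is why the samples above mix clearly matching and clearly non-matching weight vectors.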

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 43 matches and 6 non-matches
    Purity of oracle classification:  0.878
    Entropy of oracle classification: 0.536
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing the file: diverg(20)157_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 157), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)157_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 667
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 667 weight vectors
  Containing 217 true matches and 450 true non-matches
    (32.53% true matches)
  Identified 630 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   612  (97.14%)
          2 :    15  (2.38%)
          3 :     2  (0.32%)
         19 :     1  (0.16%)
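The uniqueness and occurrence figures above can be reproduced by hashing each weight vector and counting how often every distinct vector occurs; a minimal sketch, assuming vectors are hashable as tuples (the helper name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Number of distinct weight vectors, plus how many distinct
    vectors share each occurrence count."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    freq_of_freq = Counter(vec_counts.values())
    return len(vec_counts), dict(sorted(freq_of_freq.items()))

# For the log above, 667 vectors of which 630 are distinct would yield
# (630, {1: 612, 2: 15, 3: 2, 19: 1}).
```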

Identified 1 non-pure unique weight vector (from 630 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 447

Removed 1 non-pure weight vector

Final number of weight vectors to use: 666
  Number of unique weight vectors: 630

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (630, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 630 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 630 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 547 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 133 matches and 414 non-matches
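The SVM step trains on the vectors the oracle just labelled and splits the remaining unlabelled vectors into a predicted-match and a predicted-non-match cluster, which then re-enter the queue. A hedged sketch using scikit-learn's SVC (the linear kernel and the helper name are assumptions; the original program's SVM settings are not shown in this log):

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, unlabelled_vecs):
    """Train an SVM on oracle-classified weight vectors, then split the
    remaining vectors into predicted match / non-match clusters."""
    clf = svm.SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)   # labels: 1 = match, 0 = non-match
    preds = clf.predict(unlabelled_vecs)
    match_cluster = [v for v, p in zip(unlabelled_vecs, preds) if p == 1]
    non_match_cluster = [v for v, p in zip(unlabelled_vecs, preds) if p == 0]
    return match_cluster, non_match_cluster
```

Each resulting cluster then inherits the purity, entropy, and estimated match proportion of the training sample until it is sampled itself, which is why both queue entries in the next loop show identical statistics.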

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (414, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 414 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 414 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 12 matches and 58 non-matches
    Purity of oracle classification:  0.829
    Entropy of oracle classification: 0.661
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(20)493_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 493), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)493_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 845
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 845 weight vectors
  Containing 227 true matches and 618 true non-matches
    (26.86% true matches)
  Identified 788 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   751  (95.30%)
          2 :    34  (4.31%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 788 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 597

Removed 1 non-pure weight vector

Final number of weight vectors to use: 844
  Number of unique weight vectors: 788

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (788, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 788 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 788 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 703 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 162 matches and 541 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (162, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (541, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 162 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 162 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0
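The purity and entropy figures printed above follow the usual two-class definitions: with m matches and u non-matches in a sample, purity is max(m, u)/(m + u) and entropy is the binary Shannon entropy of the match proportion. A minimal sketch of just these two quantities (function and variable names are illustrative, not from the program):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Two-class purity and binary Shannon entropy of a labelled sample."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)        # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                  # 0 * log(0) contributes nothing
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The oracle block above: 48 matches, 8 non-matches
purity, entropy = purity_entropy(48, 8)
print(round(purity, 3), round(entropy, 3))  # 0.857 0.592
```

The result reproduces the 0.857 / 0.592 pair reported for the 56-vector sample.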

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)632_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 632), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)632_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)
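The occurrence histogram above can be produced with two passes of a counter: first count how often each weight vector occurs, then count how many distinct vectors share each occurrence count. A sketch with `collections.Counter` (the example data is illustrative, not from the file):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of distinct vectors occurring that often."""
    per_vector = Counter(tuple(v) for v in weight_vectors)  # vector -> count
    return Counter(per_vector.values())                     # count -> num vectors

vecs = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3], [0.9, 0.9], [0.9, 0.9], [0.1, 0.1]]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 2), (2, 2)]
```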

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as a percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
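A unique weight vector is non-pure when it was generated by both true matching and true non-matching record pairs; the run above keeps the majority class and drops the minority instances (here 1 of the 12 copies of the vector with pureness 0.917). A hedged sketch of that filter (names and the tie-breaking rule are illustrative assumptions):

```python
from collections import defaultdict

def remove_minority_instances(labelled_vectors):
    """Drop instances whose label is the minority class for their weight vector.

    labelled_vectors: iterable of (weight_tuple, is_match) pairs.
    """
    by_vec = defaultdict(lambda: [0, 0])  # vector -> [non-match count, match count]
    for vec, is_match in labelled_vectors:
        by_vec[vec][int(is_match)] += 1
    kept = []
    for vec, is_match in labelled_vectors:
        non, mat = by_vec[vec]
        majority_is_match = mat >= non    # tie resolved towards matches (assumption)
        if is_match == majority_is_match:
            kept.append((vec, is_match))
    return kept

# 11 match copies + 1 non-match copy of the same vector: pureness 11/12 = 0.917
data = [((1.0, 0.9), True)] * 11 + [((1.0, 0.9), False)]
print(len(remove_minority_instances(data)))  # 11
```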

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
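Farthest-first selection, as used above, greedily grows the sample: starting from a seed vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest, which favours the corners of the weight space. A minimal sketch (Euclidean distance and first-vector seeding are assumptions; the actual program may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed with the first vector (an assumption)
    min_dist = [dist(vectors[0], v) for v in vectors]
    while len(selected) < k:
        # pick the vector farthest from everything chosen so far
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        # update each vector's distance to its nearest selected vector
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(vectors[idx], v))
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

Each round costs one pass over all vectors, so selecting k of n vectors is O(k·n) distance updates.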

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches
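After the oracle labels a sample, the remaining unlabelled vectors in the cluster are split by a classifier trained on those labels — here an SVM splits the 948 leftover vectors into 101 predicted matches and 847 predicted non-matches. A sketch with scikit-learn (the default RBF kernel and parameters are assumptions; the log does not show them):

```python
from sklearn import svm

def svm_split(train_vectors, train_labels, rest_vectors):
    """Train an SVM on oracle-labelled vectors, then split the remaining
    vectors into predicted matches and predicted non-matches."""
    clf = svm.SVC()  # default RBF kernel -- an assumption
    clf.fit(train_vectors, train_labels)
    pred = clf.predict(rest_vectors)
    matches = [v for v, p in zip(rest_vectors, pred) if p]
    non_matches = [v for v, p in zip(rest_vectors, pred) if not p]
    return matches, non_matches

# Tiny illustrative split: two well-separated 1-D clusters
m, n = svm_split([[0.0], [0.1], [0.9], [1.0]], [0, 0, 1, 1], [[0.05], [0.95]])
```

The two resulting subsets become new clusters on the queue, each carrying the parent sample's purity, entropy, and estimated match proportion until they are sampled themselves.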

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 101 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(20)896_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 896), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)896_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 831
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 831 weight vectors
  Containing 227 true matches and 604 true non-matches
    (27.32% true matches)
  Identified 774 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   737  (95.22%)
          2 :    34  (4.39%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 774 unique weight vectors)
Pureness (as a percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 583

Removed 1 non-pure weight vector

Final number of weight vectors to use: 830
  Number of unique weight vectors: 774

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (774, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 774 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 774 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 689 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 151 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (538, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 151 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 51 matches and 3 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.310
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)875_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 875), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)875_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as a percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00 accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

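The purity and entropy figures reported for each oracle-labelled sample can be reproduced from the match/non-match counts alone. A minimal sketch, assuming purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the binary split (which matches the numbers printed above):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity is the fraction of vectors in the majority class; entropy
    is the Shannon entropy (base 2) of the match/non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                  # 0*log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The values for the 23 / 65 oracle split above:
purity, entropy = purity_and_entropy(23, 65)
print(round(purity, 3), round(entropy, 3))  # 0.739 0.829
```

Note that both clusters created by the subsequent split inherit these sample-based estimates, which is why the two queue entries in Loop 2 show identical purity and entropy.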
Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

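The SVM step trains on the oracle-labelled sample and splits the remaining weight vectors into predicted matches and non-matches. A minimal sketch using scikit-learn's `SVC` — the actual classifier, kernel, and data are assumptions, with synthetic vectors standing in for the real ones:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Hypothetical labelled sample: 23 matches (high similarities) and
# 65 non-matches (lower similarities), 7 weights per vector.
X_train = np.vstack([rng.uniform(0.7, 1.0, size=(23, 7)),
                     rng.uniform(0.0, 0.6, size=(65, 7))])
y_train = np.array([1] * 23 + [0] * 65)

clf = SVC(kernel='linear')
clf.fit(X_train, y_train)

# Split the 956 unlabelled remaining vectors into two new clusters.
X_rest = rng.uniform(0.0, 1.0, size=(956, 7))
pred = clf.predict(X_rest)
matches = X_rest[pred == 1]
non_matches = X_rest[pred == 0]
print(len(matches) + len(non_matches))  # 956
```

The two resulting subsets are what gets pushed back onto the cluster queue for the next loop iteration.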
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

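The farthest-first selections listed above follow the standard greedy max-min traversal: each step picks the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster. A plain-Python sketch; the starting vector and the Euclidean metric are assumptions:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: return k indices into `vectors`."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [0]  # assumed start: the first vector
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [dist(vectors[0], v) for v in vectors]
    while len(selected) < k:
        far = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(far)
        for i, v in enumerate(vectors):
            d = dist(vectors[far], v)
            if d < min_dist[i]:
                min_dist[i] = d
    return selected

pts = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (1.0, 0.0), (0.5, 0.5)]
print(farthest_first(pts, 3))  # [0, 1, 3]
```

This diversity-seeking behaviour explains why the selected samples mix high- and low-similarity vectors rather than clustering around one region.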
Perform oracle with 100.00 accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)931_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                 0.976
recall                 0.408027
f-measure              0.575472
da                          125
dm                            0
ndm                           0
tp                          122
fp                            3
tn                  4.76529e+07
fn                          177
Name: (15, 1 - acm diverg, 931), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)931_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 969
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 969 weight vectors
  Containing 143 true matches and 826 true non-matches
    (14.76% true matches)
  Identified 935 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   906  (96.90%)
          2 :    26  (2.78%)
          3 :     2  (0.21%)
          5 :     1  (0.11%)

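The occurrence distribution above counts how often each exact weight vector appears, then tabulates how many unique vectors occur once, twice, and so on (percentages are of the unique-vector count). A sketch with `collections.Counter`, using toy data in place of the real vectors:

```python
from collections import Counter

# Toy weight vectors (tuples so they are hashable); one occurs three
# times, one twice, two once each.
weight_vectors = [
    (1.0, 0.5), (1.0, 0.5), (0.2, 0.9), (0.0, 0.0),
    (1.0, 0.5), (0.2, 0.9), (0.7, 0.7),
]

occ = Counter(weight_vectors)   # vector -> occurrence count
freq = Counter(occ.values())    # occurrence count -> number of vectors
unique = len(occ)
print('Identified %d unique weight vectors' % unique)
for count in sorted(freq):
    num = freq[count]
    print('  %3d : %4d  (%.2f%%)' % (count, num, 100.0 * num / unique))
```
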
Identified 0 non-pure unique weight vectors (from 935 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 129
     0.000 : 806

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 969
  Number of unique weight vectors: 935

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (935, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 935 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 935 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00 accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 848 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 91 matches and 757 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (91, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (757, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 91 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 91 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)

Perform oracle with 100.00 accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 40 matches and 3 non-matches
    Purity of oracle classification:  0.930
    Entropy of oracle classification: 0.365
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

125.0
Analysing the file: diverg(15)52_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 52), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)52_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 544
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 544 weight vectors
  Containing 209 true matches and 335 true non-matches
    (38.42% true matches)
  Identified 513 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   498  (97.08%)
          2 :    12  (2.34%)
          3 :     2  (0.39%)
         16 :     1  (0.19%)

Identified 1 non-pure unique weight vector (from 513 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 178
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 334

Removed 1 non-pure weight vectors

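The pureness filter above handles unique weight vectors that occur with conflicting true-match labels: the 16-occurrence vector was 0.938 pure (15 matches, 1 non-match), and its single minority-class copy was removed. A sketch of this filtering with toy data; the exact removal rule is an assumption based on the log's "(minority class weight vectors ... to be removed)" note:

```python
from collections import defaultdict

# (vector, true_match) pairs; one vector appears 16 times with mixed labels.
data = [((1.0, 1.0), True)] * 15 + [((1.0, 1.0), False)] \
     + [((0.2, 0.1), False)] * 3

by_vec = defaultdict(list)
for vec, label in data:
    by_vec[vec].append(label)

kept = []
for vec, labels in by_vec.items():
    pureness = sum(labels) / len(labels)   # fraction of match labels
    if 0.0 < pureness < 1.0:               # non-pure: drop minority copies
        majority = pureness >= 0.5
        kept += [(vec, majority)] * labels.count(majority)
    else:                                  # pure: keep all copies
        kept += [(vec, l) for l in labels]

print(len(data) - len(kept))  # 1 copy removed
```
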
Final number of weight vectors to use: 543
  Number of unique weight vectors: 513

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (513, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 513 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 513 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00 accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 30 matches and 51 non-matches
    Purity of oracle classification:  0.630
    Entropy of oracle classification: 0.951
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 432 weight vectors
  Based on 30 matches and 51 non-matches
  Classified 151 matches and 281 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6296296296296297, 0.9509560484549725, 0.37037037037037035)
    (281, 0.6296296296296297, 0.9509560484549725, 0.37037037037037035)

Current size of match and non-match training data sets: 30 / 51

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 151 weight vectors
- Estimated match proportion 0.370

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.933, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00 accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 49 matches and 7 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(10)340_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976923
recall                 0.424749
f-measure              0.592075
da                          130
dm                            0
ndm                           0
tp                          127
fp                            3
tn                  4.76529e+07
fn                          172
Name: (10, 1 - acm diverg, 340), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)340_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 934
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 934 weight vectors
  Containing 137 true matches and 797 true non-matches
    (14.67% true matches)
  Identified 900 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   871  (96.78%)
          2 :    26  (2.89%)
          3 :     2  (0.22%)
          5 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 900 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 123
     0.000 : 777
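
The pureness statistics above can be obtained by grouping identical weight vectors and taking each group's fraction of true matches; a vector is "non-pure" when that fraction is strictly between 0 and 1. A minimal sketch (the function name is illustrative, not taken from the script):

```python
from collections import defaultdict

def pureness(weight_vectors, labels):
    # Group identical weight vectors and compute, for each unique vector,
    # the fraction of its occurrences that are true matches.
    counts = defaultdict(lambda: [0, 0])  # vector -> [num_matches, total]
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)
        counts[key][0] += int(is_match)
        counts[key][1] += 1
    return {vec: m / n for vec, (m, n) in counts.items()}
```

A unique vector with pureness 0.950, as in later file blocks of this log, would have its minority-class occurrences removed.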

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 934
  Number of unique weight vectors: 900

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (900, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 900 weight vectors
- Estimated match proportion 0.500
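
The outer loop visible in this log (pick a cluster from the queue, label a sample via the oracle, re-queue the split children until the budget runs out) can be sketched as below. This is a simplified stand-in: `sample_fn` and `split_fn` are hypothetical placeholders for the farthest-first sampling and the classifier-based splitting steps, and the purity/size stopping checks are omitted.

```python
import random

def selection_loop(initial_cluster, budget, oracle, sample_fn, split_fn):
    # Maintain a queue of clusters; pick one at random ("queue ordering:
    # random"), label a sample with the oracle, and push split children
    # back onto the queue until the manual classification budget is spent.
    queue = [initial_cluster]
    matches, non_matches = [], []
    used = 0
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))
        sample = sample_fn(cluster)
        used += len(sample)
        for vec in sample:
            (matches if oracle(vec) else non_matches).append(vec)
        rest = [v for v in cluster if v not in sample]
        if rest:
            queue.extend(split_fn(rest, matches, non_matches))
    return matches, non_matches, used
```

With a budget of, say, 10 oracle calls and a prefix sampler, the loop stops as soon as the budget is exhausted, mirroring the "Reached end of manual classification budget" messages in this log.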

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 900 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
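
Farthest-first selection, as used above, greedily picks the weight vector whose distance to its nearest already-selected vector is largest. A minimal Euclidean-distance sketch (the choice of start vector and tie-breaking may differ from the actual implementation):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector, then
    # repeatedly add the vector maximising the distance to its nearest
    # already-selected vector (the classic 2-approximation for k-center).
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        def dist_to_selected(v):
            return min(math.dist(v, s) for s in selected)
        farthest = max(remaining, key=dist_to_selected)
        selected.append(farthest)
        remaining.remove(farthest)
    return selected
```

This tends to pick vectors spread across the whole cluster, which is why the samples above mix clear matches and clear non-matches.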

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 30 matches and 56 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.933
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 814 weight vectors
  Based on 30 matches and 56 non-matches
  Classified 236 matches and 578 non-matches
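
The SVM split step trains on the oracle-labelled samples and partitions the remaining unlabelled vectors by predicted class, producing the two child clusters queued in the next loop. A sketch using scikit-learn's `SVC` as a stand-in (the linear kernel and all parameters are assumptions; the script's own SVM setup may differ):

```python
from sklearn.svm import SVC

def split_cluster(train_vecs, train_labels, cluster_vecs):
    # Fit a binary SVM on the oracle-labelled samples (1 = match,
    # 0 = non-match), then split the remaining cluster into
    # predicted-match and predicted-non-match child clusters.
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    pred_matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    pred_non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return pred_matches, pred_non_matches
```

Here the 814 leftover vectors were split into 236 predicted matches and 578 predicted non-matches based on 30 match and 56 non-match training samples.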

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (236, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)
    (578, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)

Current size of match and non-match training data sets: 30 / 56

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 236 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 236 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.971, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.817, 1.000, 0.194, 0.091, 0.163, 0.222, 0.200] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.600, 0.944, 0.250, 0.200, 0.186, 0.136, 0.118] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 39 matches and 25 non-matches
    Purity of oracle classification:  0.609
    Entropy of oracle classification: 0.965
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  25
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

130.0
Analysing file: diverg(15)926_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 926), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)926_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 802
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 802 weight vectors
  Containing 226 true matches and 576 true non-matches
    (28.18% true matches)
  Identified 745 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   708  (95.03%)
          2 :    34  (4.56%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 745 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 555

Removed 1 non-pure weight vector

Final number of weight vectors to use: 801
  Number of unique weight vectors: 745

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (745, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 745 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 745 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 32 matches and 53 non-matches
    Purity of oracle classification:  0.624
    Entropy of oracle classification: 0.956
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 660 weight vectors
  Based on 32 matches and 53 non-matches
  Classified 331 matches and 329 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (331, 0.6235294117647059, 0.9555111232924128, 0.3764705882352941)
    (329, 0.6235294117647059, 0.9555111232924128, 0.3764705882352941)

Current size of match and non-match training data sets: 32 / 53

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 329 weight vectors
- Estimated match proportion 0.376

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 329 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.556, 0.348, 0.467, 0.636, 0.412] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.269, 0.478, 0.750, 0.385, 0.455] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.571, 0.857, 0.583, 0.667, 0.889] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)373_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 373), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)373_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 893
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 893 weight vectors
  Containing 198 true matches and 695 true non-matches
    (22.17% true matches)
  Identified 848 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   814  (95.99%)
          2 :    31  (3.66%)
          3 :     2  (0.24%)
         11 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 848 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 173
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 674

Removed 1 non-pure weight vector

Final number of weight vectors to use: 892
  Number of unique weight vectors: 848

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (848, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 848 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 848 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
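
The farthest-first selections shown above can be sketched as a greedy traversal: start from a seed vector and repeatedly pick the vector whose minimum distance to everything selected so far is largest. This is a minimal illustration (not the program's actual implementation), assuming Euclidean distance over the similarity weight vectors:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over weight vectors (tuples of floats):
    repeatedly select the vector whose minimum distance to the set selected
    so far is largest, yielding a diverse training sample."""
    selected = [vectors[0]]              # seed: an arbitrary starting vector
    remaining = list(vectors[1:])
    # cached minimum distance from each remaining vector to the selected set
    min_dist = [math.dist(v, selected[0]) for v in remaining]
    while len(selected) < k and remaining:
        i = max(range(len(remaining)), key=lambda j: min_dist[j])
        chosen = remaining.pop(i)
        min_dist.pop(i)
        selected.append(chosen)
        # tighten cached distances against the newly chosen vector
        for j, v in enumerate(remaining):
            min_dist[j] = min(min_dist[j], math.dist(v, chosen))
    return selected
```

A sample chosen this way covers the corners of the similarity space, which is why both clear matches and clear non-matches appear in the lists above.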

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 27 matches and 59 non-matches
    Purity of oracle classification:  0.686
    Entropy of oracle classification: 0.898
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
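
The purity and entropy figures reported after each oracle step follow directly from the match/non-match counts: purity is the majority-class fraction and entropy is the binary Shannon entropy (base 2) of the label distribution. A small sketch (`purity_entropy` is a hypothetical helper name):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = Shannon entropy
    (base 2) of the match/non-match label distribution."""
    n = num_matches + num_non_matches
    p = num_matches / n
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                     # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

With 27 matches and 59 non-matches this reproduces the 0.686 purity and 0.898 entropy reported above.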

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 762 weight vectors
  Based on 27 matches and 59 non-matches
  Classified 194 matches and 568 non-matches
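
After the oracle step, the remaining unlabelled vectors in the cluster are split by a classifier trained on the oracle-labelled sample. A rough sketch using scikit-learn's `SVC` (an assumption — the log only says "SVM classification", not which implementation or kernel):

```python
from sklearn.svm import SVC  # assumption: an SVM akin to scikit-learn's SVC

def split_cluster(train_matches, train_non_matches, cluster_vectors):
    """Train an SVM on the oracle-labelled vectors, then split the remaining
    cluster into a predicted-match part and a predicted-non-match part."""
    X = train_matches + train_non_matches
    y = [1] * len(train_matches) + [0] * len(train_non_matches)
    clf = SVC(kernel='linear').fit(X, y)
    preds = clf.predict(cluster_vectors)
    match_part = [v for v, p in zip(cluster_vectors, preds) if p == 1]
    non_match_part = [v for v, p in zip(cluster_vectors, preds) if p == 0]
    return match_part, non_match_part
```

Both resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.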

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (194, 0.686046511627907, 0.8976844934141643, 0.313953488372093)
    (568, 0.686046511627907, 0.8976844934141643, 0.313953488372093)

Current size of match and non-match training data sets: 27 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.90
- Size 194 weight vectors
- Estimated match proportion 0.314

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 194 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.530, 1.000, 0.159, 0.086, 0.182, 0.159, 0.163] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 41 matches and 17 non-matches
    Purity of oracle classification:  0.707
    Entropy of oracle classification: 0.873
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  17
    Number of false non-matches: 0
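
The oracle itself can be simulated with a configurable accuracy; at 100.00% (as in these runs) every queried label is returned correctly, so the false match/non-match counts stay at zero. A minimal sketch with a hypothetical `noisy_oracle` helper:

```python
import random

def noisy_oracle(true_label, accuracy=1.0, rng=random):
    """Simulated oracle: return the true match status with probability
    `accuracy`, otherwise flip it. accuracy=1.0 models a perfect oracle."""
    return true_label if rng.random() < accuracy else not true_label
```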

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(20)51_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 51), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)51_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
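
The occurrence distribution above (most vectors unique, one vector repeated 20 times) amounts to two nested frequency counts; a sketch using `collections.Counter`:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight vectors
    that occur exactly that often."""
    vec_counts = Counter(map(tuple, weight_vectors))   # vector -> occurrences
    return Counter(vec_counts.values())                # occurrences -> count
```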

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836
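
A "non-pure" unique vector is one whose identical copies carry both match and non-match labels; its minority-class copies are removed, as with the single 0.950-pureness vector above. A sketch (the majority-tie rule here is an assumption, not taken from the program):

```python
from collections import defaultdict

def filter_non_pure(pairs):
    """pairs: list of (weight_vector, is_match). A unique vector's pureness
    is the fraction of its occurrences that are true matches; for non-pure
    vectors the minority-class occurrences are dropped."""
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        majority = pureness >= 0.5      # assumption: ties kept as matches
        for is_match in labels:
            if pureness in (0.0, 1.0) or is_match == majority:
                kept.append((vec, is_match))
    return kept
```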

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 29 matches and 59 non-matches
    Purity of oracle classification:  0.670
    Entropy of oracle classification: 0.914
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 29 matches and 59 non-matches
  Classified 162 matches and 777 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (162, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)
    (777, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)

Current size of match and non-match training data sets: 29 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 162 weight vectors
- Estimated match proportion 0.330

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 162 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)393_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 393), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)393_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 795
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 795 weight vectors
  Containing 209 true matches and 586 true non-matches
    (26.29% true matches)
  Identified 748 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   713  (95.32%)
          2 :    32  (4.28%)
          3 :     2  (0.27%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 748 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 565

Removed 1 non-pure weight vector

Final number of weight vectors to use: 794
  Number of unique weight vectors: 748

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (748, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 748 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 748 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 663 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 155 matches and 508 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (508, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 508 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 508 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.462, 0.609, 0.684, 0.308, 0.545] (False)
    [0.817, 1.000, 0.250, 0.212, 0.256, 0.045, 0.250] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
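The farthest-first selections above greedily pick vectors that are maximally spread out in the similarity space. A sketch under assumed Euclidean distance and a deterministic first seed vector (the actual script may seed randomly):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly add the vector whose minimum distance to the already
    selected set is largest."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]
    while len(selected) < min(k, len(vectors)):
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```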

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 2 matches and 72 non-matches
    Purity of oracle classification:  0.973
    Entropy of oracle classification: 0.179
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0
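The `oracle_acc` parameter controls how often the simulated oracle returns the true match status; at 100.00% (as above) no labels are flipped. A hypothetical sketch of that behaviour:

```python
import random

def noisy_oracle(true_labels, accuracy, seed=0):
    """Simulate a human oracle of the given accuracy: each true match
    status is kept with probability `accuracy` and flipped otherwise."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else (not lab)
            for lab in true_labels]
```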

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)906_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 906), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)906_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 801
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 801 weight vectors
  Containing 222 true matches and 579 true non-matches
    (27.72% true matches)
  Identified 747 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   710  (95.05%)
          2 :    34  (4.55%)
          3 :     2  (0.27%)
         17 :     1  (0.13%)
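The distribution above is a frequency-of-frequencies table: first count the occurrences of each distinct weight vector, then count how many distinct vectors share each occurrence level. A sketch:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Occurrence : number of distinct weight vectors occurring that often."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return dict(sorted(Counter(per_vector.values()).items()))
```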

Identified 1 non-pure unique weight vector (from 747 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 558

Removed 1 non-pure weight vector

Final number of weight vectors to use: 800
  Number of unique weight vectors: 747

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (747, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 747 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 747 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 662 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 148 matches and 514 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (514, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 514 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 514 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 8 matches and 64 non-matches
    Purity of oracle classification:  0.889
    Entropy of oracle classification: 0.503
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)109_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 109), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)109_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 392
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 392 weight vectors
  Containing 218 true matches and 174 true non-matches
    (55.61% true matches)
  Identified 359 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   343  (95.54%)
          2 :    13  (3.62%)
          3 :     2  (0.56%)
         17 :     1  (0.28%)

Identified 1 non-pure unique weight vector (from 359 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 173

Removed 1 non-pure weight vector

Final number of weight vectors to use: 391
  Number of unique weight vectors: 359

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (359, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 359 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 359 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 43 matches and 33 non-matches
    Purity of oracle classification:  0.566
    Entropy of oracle classification: 0.987
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  33
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 283 weight vectors
  Based on 43 matches and 33 non-matches
  Classified 283 matches and 0 non-matches

42.0
Analysing file: diverg(10)731_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 731), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)731_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 623
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 623 weight vectors
  Containing 194 true matches and 429 true non-matches
    (31.14% true matches)
  Identified 574 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   540  (94.08%)
          2 :    31  (5.40%)
          3 :     2  (0.35%)
         15 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 574 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 165
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 408

Removed 1 non-pure weight vector

Final number of weight vectors to use: 622
  Number of unique weight vectors: 574

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (574, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 574 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 574 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 30 matches and 52 non-matches
    Purity of oracle classification:  0.634
    Entropy of oracle classification: 0.947
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0
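
The purity and entropy values the oracle reports can be reproduced directly from the match and non-match counts: purity is the majority-class fraction of the classified sample, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (function names are illustrative, not from the original script):

```python
import math

def cluster_purity(num_match, num_non_match):
    # Purity: fraction of the sample that belongs to the majority class.
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def cluster_entropy(num_match, num_non_match):
    # Binary Shannon entropy of the match / non-match split.
    p = num_match / (num_match + num_non_match)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Counts from the oracle output above: 30 matches, 52 non-matches.
print(round(cluster_purity(30, 52), 3))   # 0.634
print(round(cluster_entropy(30, 52), 3))  # 0.947
```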

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 492 weight vectors
  Based on 30 matches and 52 non-matches
  Classified 155 matches and 337 non-matches
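
The split step logged above trains a binary classifier on the oracle-labelled weight vectors and partitions the remaining cluster by its predictions. The script itself uses an SVM (chosen via the split_classifier option); the sketch below swaps in a nearest-centroid rule, plainly a stand-in, purely to illustrate the partitioning logic without external dependencies:

```python
def split_cluster(train_match, train_non_match, cluster):
    # Partition `cluster` into a predicted-match and a predicted-non-match
    # sub-cluster.  The original script trains an SVM on the labelled
    # vectors; here a nearest-centroid rule stands in for the classifier.
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_match = centroid(train_match)
    c_non_match = centroid(train_non_match)
    match_cluster, non_match_cluster = [], []
    for vec in cluster:
        if sq_dist(vec, c_match) <= sq_dist(vec, c_non_match):
            match_cluster.append(vec)
        else:
            non_match_cluster.append(vec)
    return match_cluster, non_match_cluster
```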

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6341463414634146, 0.9474351361840306, 0.36585365853658536)
    (337, 0.6341463414634146, 0.9474351361840306, 0.36585365853658536)

Current size of match and non-match training data sets: 30 / 52

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 337 weight vectors
- Estimated match proportion 0.366
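
Each queue entry logged in this loop is a tuple of (size, purity, entropy, estimated match proportion), and with queue ordering "random" the next cluster to refine is drawn at random. A small sketch using the two clusters from Loop 2 above (values copied from the log; the variable names are illustrative):

```python
import random

# Queue entries mirror the logged tuples:
# (size, purity, entropy, estimated match proportion)
queue = [
    (155, 0.6341, 0.9474, 0.3659),
    (337, 0.6341, 0.9474, 0.3659),
]

rng = random.Random()                         # "queue ordering: random"
cluster = queue.pop(rng.randrange(len(queue)))
size, purity, entropy, est_match_prop = cluster
```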

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 337 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.538, 0.500, 0.818, 0.789, 0.750] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [1.000, 0.000, 0.750, 0.778, 0.471, 0.727, 0.684] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.833, 0.571, 0.727, 0.647, 0.857] (False)
    [1.000, 0.000, 0.857, 0.286, 0.500, 0.643, 0.600] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [0.800, 0.000, 0.625, 0.571, 0.467, 0.474, 0.667] (False)
    [1.000, 0.000, 0.423, 0.478, 0.500, 0.813, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.583, 0.389, 0.471, 0.545, 0.474] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.385, 0.391, 0.667, 0.579, 0.824] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.000, 0.700, 0.818, 0.444, 0.619] (False)
    [1.000, 0.000, 0.857, 0.444, 0.556, 0.235, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [1.000, 0.000, 0.333, 0.750, 0.667, 0.667, 0.571] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.067, 0.550, 0.818, 0.727, 0.762] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
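
The farthest-first sampling used throughout these loops can be sketched as a greedy traversal: start from one vector, then repeatedly add the vector whose distance to its closest already-selected vector is largest, spreading the sample across the weight-vector space. A minimal sketch (the function name and fixed seed are illustrative):

```python
import random

def farthest_first(vectors, k, seed=42):
    # Greedy farthest-first traversal: seed with one vector, then keep
    # adding the vector that maximises the distance to its nearest
    # already-selected vector.
    rng = random.Random(seed)
    selected = [rng.choice(vectors)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    while len(selected) < k:
        nxt = max(vectors,
                  key=lambda v: min(sq_dist(v, s) for s in selected))
        selected.append(nxt)
    return selected
```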

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 0 matches and 70 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing the file: diverg(10)651_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 651), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)651_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 487
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 487 weight vectors
  Containing 222 true matches and 265 true non-matches
    (45.59% true matches)
  Identified 451 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   432  (95.79%)
          2 :    16  (3.55%)
          3 :     2  (0.44%)
         17 :     1  (0.22%)

Identified 1 non-pure unique weight vector (from 451 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 262

Removed 1 non-pure weight vector

Final number of weight vectors to use: 486
  Number of unique weight vectors: 451
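
The occurrence analysis above (how many weight vectors appear once, twice, and so on) amounts to counting duplicates among the vectors; a minimal sketch using `collections.Counter` (the helper name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # Count how often each distinct weight vector occurs, then tabulate
    # how many unique vectors occur once, twice, etc.
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

vectors = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3], [0.9, 0.9]]
print(occurrence_distribution(vectors))  # Counter({1: 2, 2: 1})
```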

Time to load and analyse the weight vector file: 0.04 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (451, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 451 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 451 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 36 matches and 43 non-matches
    Purity of oracle classification:  0.544
    Entropy of oracle classification: 0.994
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 372 weight vectors
  Based on 36 matches and 43 non-matches
  Classified 148 matches and 224 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.5443037974683544, 0.9943290455933882, 0.45569620253164556)
    (224, 0.5443037974683544, 0.9943290455933882, 0.45569620253164556)

Current size of match and non-match training data sets: 36 / 43

Selected cluster with (queue ordering: random):
- Purity 0.54 and entropy 0.99
- Size 224 weight vectors
- Estimated match proportion 0.456

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 224 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 4 matches and 63 non-matches
    Purity of oracle classification:  0.940
    Entropy of oracle classification: 0.326
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)100_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 100), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)100_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 886
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 886 weight vectors
  Containing 175 true matches and 711 true non-matches
    (19.75% true matches)
  Identified 847 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   817  (96.46%)
          2 :    27  (3.19%)
          3 :     2  (0.24%)
          9 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 847 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 156
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 690

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 877
  Number of unique weight vectors: 846

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (846, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 846 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 846 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 760 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 78 matches and 682 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (78, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (682, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 78 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 39

Farthest first selection of 39 weight vectors from 78 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.420, 1.000, 1.000, 1.000, 1.000, 1.000, 0.947] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
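
The "farthest first" selection logged above can be sketched as a greedy farthest-point traversal: start from one vector, then repeatedly add the candidate whose minimum Euclidean distance to the already-selected set is largest. A minimal sketch (function and variable names are illustrative, not taken from the original script):

```python
import math

def farthest_first(vectors, k):
    """Greedily pick k vectors: seed with the first vector, then repeatedly
    add the vector whose minimum distance to the picked set is largest."""
    picked = [vectors[0]]
    while len(picked) < k:
        best, best_dist = None, -1.0
        for v in vectors:
            if v in picked:
                continue
            # Distance to the picked set = distance to its nearest member
            d = min(math.dist(v, p) for p in picked)
            if d > best_dist:
                best, best_dist = v, d
        picked.append(best)
    return picked
```

This greedy traversal tends to spread the sample over the whole cluster, which is why the selected vectors above mix clear matches with borderline cases.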

Perform oracle with 100.00% accuracy on 39 weight vectors
  The oracle will correctly classify 39 weight vectors and wrongly classify 0
  Classified 38 matches and 1 non-matches
    Purity of oracle classification:  0.974
    Entropy of oracle classification: 0.172
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
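
The purity and entropy figures reported for each oracle-classified sample follow the usual two-class definitions: purity is the majority-class proportion, and entropy is the Shannon entropy (in bits) of the match/non-match split. A sketch consistent with the numbers above (38 matches and 1 non-match give purity ≈ 0.974 and entropy ≈ 0.172); function names are illustrative:

```python
import math

def purity(n_match, n_nonmatch):
    # Majority-class proportion of the sample.
    total = n_match + n_nonmatch
    return max(n_match, n_nonmatch) / total

def entropy(n_match, n_nonmatch):
    # Shannon entropy (base 2) of the two-class split; 0 * log(0) := 0.
    total = n_match + n_nonmatch
    h = 0.0
    for n in (n_match, n_nonmatch):
        if n:
            p = n / total
            h -= p * math.log2(p)
    return h
```

A perfectly mixed sample (e.g. 36 matches / 36 non-matches) gives purity 0.5 and entropy 1.0, matching the Loop 1 statistics further down.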

Deleted 39 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analyzing file: diverg(10)79_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 79), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)79_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 313
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 313 weight vectors
  Containing 196 true matches and 117 true non-matches
    (62.62% true matches)
  Identified 289 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   276  (95.50%)
          2 :    10  (3.46%)
          3 :     2  (0.69%)
         11 :     1  (0.35%)

Identified 1 non-pure unique weight vector (from 289 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 172
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 116

Removed 1 non-pure weight vector

Final number of weight vectors to use: 312
  Number of unique weight vectors: 289

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (289, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 289 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 72

Perform initial selection using "far" method

Farthest first selection of 72 weight vectors from 289 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 36 matches and 36 non-matches
    Purity of oracle classification:  0.500
    Entropy of oracle classification: 1.000
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  36
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 217 weight vectors
  Based on 36 matches and 36 non-matches
  Classified 143 matches and 74 non-matches
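
The split step trains a classifier on the oracle-labelled vectors and partitions the remaining cluster members into predicted-match and predicted-non-match sub-clusters, which are then queued separately. The script uses an SVM for this; the sketch below substitutes a dependency-free nearest-centroid rule to illustrate the same train-then-split pattern (all names are illustrative):

```python
import math

def split_cluster(train_matches, train_nonmatches, unlabeled):
    """Partition unlabeled vectors by which class centroid is nearer.
    Stand-in for the SVM used in the log; same train/split interface."""
    def centroid(vecs):
        # Coordinate-wise mean of a list of equal-length vectors.
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    c_match = centroid(train_matches)
    c_nonmatch = centroid(train_nonmatches)
    pred_matches, pred_nonmatches = [], []
    for v in unlabeled:
        if math.dist(v, c_match) <= math.dist(v, c_nonmatch):
            pred_matches.append(v)
        else:
            pred_nonmatches.append(v)
    return pred_matches, pred_nonmatches
```

Each returned sub-cluster then re-enters the queue with the purity, entropy, and match-proportion estimates shown in the next loop.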

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 72
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.5, 1.0, 0.5)
    (74, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 36 / 36

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 74 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 74 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.800, 1.000, 0.167, 0.180, 0.151, 0.147, 0.203] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 4 matches and 38 non-matches
    Purity of oracle classification:  0.905
    Entropy of oracle classification: 0.454
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing file: diverg(20)647_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 647), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)647_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)585_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 585), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)585_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 537
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 537 weight vectors
  Containing 209 true matches and 328 true non-matches
    (38.92% true matches)
  Identified 506 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   491  (97.04%)
          2 :    12  (2.37%)
          3 :     2  (0.40%)
         16 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 506 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 178
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 327

Removed 1 non-pure weight vector

Final number of weight vectors to use: 536
  Number of unique weight vectors: 506

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (506, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 506 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 506 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 31 matches and 50 non-matches
    Purity of oracle classification:  0.617
    Entropy of oracle classification: 0.960
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0
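
Purity here is the majority-class fraction of the oracle-labelled sample, entropy is the binary Shannon entropy of the match/non-match split (in bits), and the estimated match proportion is simply the match fraction. A minimal sketch reproducing the figures above (the function name is illustrative, not the program's own):

```python
import math

def cluster_stats(num_match, num_nonmatch):
    """Purity (majority-class fraction), binary entropy in bits, and
    match proportion of a labelled sample. Name is illustrative."""
    total = num_match + num_nonmatch
    p_match = num_match / total
    purity = max(p_match, 1.0 - p_match)
    entropy = sum(-p * math.log2(p)
                  for p in (p_match, 1.0 - p_match) if p > 0.0)
    return purity, entropy, p_match

# Values from the oracle step above: 31 matches, 50 non-matches
purity, entropy, est_match = cluster_stats(31, 50)
print(f"{purity:.3f} {entropy:.3f} {est_match:.3f}")  # 0.617 0.960 0.383
```

These are the same three numbers carried along with each cluster in the queue listings below.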

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 425 weight vectors
  Based on 31 matches and 50 non-matches
  Classified 150 matches and 275 non-matches
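
The split step trains a classifier on the oracle-labelled sample and uses it to partition the remaining cluster into predicted matches and non-matches, which then re-enter the queue. The program uses an SVM for this; as a dependency-free illustration of the same train-then-split pattern, here is a nearest-centroid stand-in (explicitly not the SVM the log reports):

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(unlabelled, match_train, nonmatch_train):
    """Partition unlabelled weight vectors by distance to the two
    training centroids. Stand-in for the SVM split in the log above."""
    c_match = centroid(match_train)
    c_non = centroid(nonmatch_train)
    matches, nonmatches = [], []
    for v in unlabelled:
        if math.dist(v, c_match) <= math.dist(v, c_non):
            matches.append(v)
        else:
            nonmatches.append(v)
    return matches, nonmatches
```

Each resulting sub-cluster is queued with its parent's purity, entropy, and estimated match proportion, as the Loop 2 listing below shows.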

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6172839506172839, 0.9599377175669783, 0.38271604938271603)
    (275, 0.6172839506172839, 0.9599377175669783, 0.38271604938271603)

Current size of match and non-match training data sets: 31 / 50

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 275 weight vectors
- Estimated match proportion 0.383

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 275 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [1.000, 0.000, 0.864, 0.667, 0.435, 0.700, 0.600] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.846, 0.857, 0.353, 0.318, 0.400] (False)
    [0.680, 0.000, 0.609, 0.737, 0.600, 0.529, 0.696] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.762, 0.714, 0.500, 0.400] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 4 matches and 64 non-matches
    Purity of oracle classification:  0.941
    Entropy of oracle classification: 0.323
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(20)947_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 947), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)947_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vectors (from 769 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vectors
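
A non-pure unique weight vector is one identical similarity vector that occurs with both match and non-match labels; the clean-up above drops its minority-label copies (here, the single non-match copy of the 0.950-pureness vector, taking 808 vectors down to 807). A sketch of that step (function name is illustrative):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors):
    """Group identical weight vectors; where a vector carries both match
    and non-match labels, drop the minority-class copies. Each element
    of weight_vectors is a (vector, is_match) pair. Name is illustrative."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        n_match = sum(labels)
        majority = n_match * 2 >= len(labels)  # ties kept as matches
        kept.extend((list(vec), lbl) for lbl in labels if lbl == majority)
    return kept
```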

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 141 matches and 543 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 543 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 543 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 12 matches and 61 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.645
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)342_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 342), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)342_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 408
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 408 weight vectors
  Containing 176 true matches and 232 true non-matches
    (43.14% true matches)
  Identified 385 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   368  (95.58%)
          2 :    14  (3.64%)
          3 :     2  (0.52%)
          6 :     1  (0.26%)

Identified 0 non-pure unique weight vectors (from 385 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 155
     0.000 : 230

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 408
  Number of unique weight vectors: 385

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (385, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 385 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 385 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 38 matches and 39 non-matches
    Purity of oracle classification:  0.506
    Entropy of oracle classification: 1.000
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  39
    Number of false non-matches: 0
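
The oracle accuracy parameter (100.00 here, i.e. a perfect oracle) models human labelling error: each queried pair is labelled correctly with that probability and flipped otherwise, which is why false matches and false non-matches are both zero throughout this log. A minimal sketch (function name and seeding are assumptions):

```python
import random

def noisy_oracle(true_labels, accuracy, seed=0):
    """Return oracle labels: each true label is kept with probability
    `accuracy` and flipped otherwise. accuracy=1.0 gives the perfect
    oracle seen in this log. Name and seeding are assumptions."""
    rng = random.Random(seed)
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]

labels = [True, False, True]
print(noisy_oracle(labels, 1.0))  # [True, False, True]
```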

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 308 weight vectors
  Based on 38 matches and 39 non-matches
  Classified 251 matches and 57 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (251, 0.5064935064935064, 0.9998783322990061, 0.4935064935064935)
    (57, 0.5064935064935064, 0.9998783322990061, 0.4935064935064935)

Current size of match and non-match training data sets: 38 / 39

Selected cluster with (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 57 weight vectors
- Estimated match proportion 0.494

Sample size for this cluster: 36

Farthest first selection of 36 weight vectors from 57 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.818, 0.727, 0.438, 0.375, 0.400] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
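The farthest-first selection above can be sketched as a greedy max-min traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest. This is a simplified sketch (seeding from the first vector; the real script may seed differently, e.g. from corner weight vectors):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over a list of equal-length
    # tuples: start from the first vector, then repeatedly add the
    # remaining vector with the largest minimum Euclidean distance
    # to the vectors selected so far.
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```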

Perform oracle with 100.00% accuracy on 36 weight vectors
  The oracle will correctly classify 36 weight vectors and wrongly classify 0
  Classified 0 matches and 36 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  36
    Number of false non-matches: 0
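The purity and entropy figures reported after each oracle call are consistent with the majority-class fraction and the binary Shannon entropy (in bits) of the match proportion; a minimal sketch (the function name is illustrative):

```python
import math

def oracle_stats(num_matches, num_non_matches):
    # Purity is the fraction of the majority class; entropy is the
    # binary Shannon entropy of the match proportion, in bits.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy
```

For example, the 23 matches / 65 non-matches split reported later yields purity 0.739 and entropy 0.829, matching the log.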

*** Warning: Oracle returns an empty match dictionary ***
Deleted 36 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analysing file: diverg(20)840_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 840), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)840_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853
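The frequency and pureness analysis above can be reproduced with two counting passes (a sketch with hypothetical names; "pureness" is the fraction of true matches among the record pairs that share one unique weight vector):

```python
from collections import Counter

def analyse_vectors(weight_vectors, labels):
    # weight_vectors: list of equal-length tuples; labels: parallel
    # list of True (match) / False (non-match) flags.
    occ = Counter(weight_vectors)
    # Occurrence -> number of unique weight vectors occurring that often
    freq_dist = Counter(occ.values())
    # Matches per unique vector, then pureness as match fraction
    match_count = Counter(v for v, m in zip(weight_vectors, labels) if m)
    pureness = {v: match_count[v] / n for v, n in occ.items()}
    return freq_dist, pureness
```

A vector with pureness strictly between 0 and 1 (such as the 0.950 entry above) is non-pure, and its minority-class copies are removed before training-example selection.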

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)459_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 459), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)459_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 813
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 813 weight vectors
  Containing 209 true matches and 604 true non-matches
    (25.71% true matches)
  Identified 766 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   731  (95.43%)
          2 :    32  (4.18%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 766 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 583

Removed 1 non-pure weight vector

Final number of weight vectors to use: 812
  Number of unique weight vectors: 766

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (766, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 766 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 766 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 26 matches and 59 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.888
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 681 weight vectors
  Based on 26 matches and 59 non-matches
  Classified 126 matches and 555 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (126, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)
    (555, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)

Current size of match and non-match training data sets: 26 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 126 weight vectors
- Estimated match proportion 0.306

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 126 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 48 matches and 2 non-matches
    Purity of oracle classification:  0.960
    Entropy of oracle classification: 0.242
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)133_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.980198
recall                 0.331104
f-measure                 0.495
da                          101
dm                            0
ndm                           0
tp                           99
fp                            2
tn                  4.76529e+07
fn                          200
Name: (10, 1 - acm diverg, 133), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)133_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 159
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 159 weight vectors
  Containing 136 true matches and 23 true non-matches
    (85.53% true matches)
  Identified 148 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   141  (95.27%)
          2 :     4  (2.70%)
          3 :     2  (1.35%)
          4 :     1  (0.68%)

Identified 0 non-pure unique weight vectors (from 148 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 125
     0.000 : 23

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 159
  Number of unique weight vectors: 148

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 148 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 58

Perform initial selection using "far" method

Farthest first selection of 58 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

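The "farthest first" selection above can be sketched as a greedy max-min traversal: each new pick is the weight vector whose closest already-selected vector is farthest away. This is a minimal illustration assuming Euclidean distance and a fixed seed vector (function and variable names are hypothetical, not the program's actual implementation; real implementations often seed randomly):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors: each new pick maximises its minimum
    Euclidean distance to the vectors already selected."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    selected = [vectors[0]]  # seed with an arbitrary vector (assumption)
    while len(selected) < k:
        # candidate whose closest selected neighbour is farthest away
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

farthest_first([(0, 0), (1, 0), (10, 0), (5, 0)], 3)
# → [(0, 0), (10, 0), (5, 0)]
```

The max-min criterion is what makes the selected sample spread across the whole cluster rather than concentrate near its centre, which is why the lists above mix clear matches and clear non-matches.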
Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 39 matches and 19 non-matches
    Purity of oracle classification:  0.672
    Entropy of oracle classification: 0.912
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  19
    Number of false non-matches: 0

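The purity and entropy figures the oracle reports follow directly from the match/non-match counts: purity is the majority-class fraction, entropy the base-2 Shannon entropy of the split. A small sketch (hypothetical helper names) that reproduces the values above:

```python
import math

def purity(num_match, num_non_match):
    """Fraction of the sample in the majority class."""
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    """Shannon entropy (base 2) of the match/non-match split."""
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        if count > 0:
            p = count / total
            h -= p * math.log(p, 2)
    return h

# The oracle sample above: 39 matches, 19 non-matches
print(round(purity(39, 19), 3), round(entropy(39, 19), 3))  # 0.672 0.912
```

The same two numbers are what later appear as the estimated purity and entropy of the subclusters pushed back onto the queue.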
Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 90 weight vectors
  Based on 39 matches and 19 non-matches
  Classified 90 matches and 0 non-matches

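The SVM step trains on the oracle-labelled sample and partitions the remaining unlabelled weight vectors into two subclusters by predicted class. A hedged sketch assuming scikit-learn's `SVC` with a linear kernel (the original program may use a different SVM library or kernel):

```python
from sklearn.svm import SVC

def split_cluster(sample_vectors, sample_labels, remaining_vectors):
    """Train an SVM on the oracle-labelled sample, then partition the
    remaining weight vectors by predicted match status (1/0)."""
    clf = SVC(kernel='linear')  # kernel choice is an assumption
    clf.fit(sample_vectors, sample_labels)
    predictions = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(remaining_vectors, predictions) if p == 0]
    return matches, non_matches
```

Each of the two returned lists becomes a new cluster in the queue; in the degenerate case above (90 matches, 0 non-matches) only one non-empty subcluster survives.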
101.0
Analysing the file: diverg(15)870_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 870), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)870_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 678
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 678 weight vectors
  Containing 215 true matches and 463 true non-matches
    (31.71% true matches)
  Identified 626 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   590  (94.25%)
          2 :    33  (5.27%)
          3 :     2  (0.32%)
         16 :     1  (0.16%)

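The occurrence analysis above — how many distinct weight vectors appear once, twice, and so on — is two nested counts: first count occurrences per unique vector, then count how many vectors share each occurrence count. A minimal sketch with hypothetical names:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that appear exactly that often."""
    vector_counts = Counter(map(tuple, weight_vectors))  # vector -> occurrences
    return dict(sorted(Counter(vector_counts.values()).items()))

occurrence_distribution(
    [[1.0, 0.5], [1.0, 0.5], [0.2, 0.9],
     [0.8, 0.8], [0.8, 0.8], [0.8, 0.8]])
# → {1: 1, 2: 1, 3: 1}
```

Summing count × vectors over the distribution recovers the total (here 590 + 2·33 + 3·2 + 16·1 = 678 weight vectors from 626 unique ones).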
Identified 1 non-pure unique weight vector (from 626 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 442

Removed 1 non-pure weight vector

Final number of weight vectors to use: 677
  Number of unique weight vectors: 626

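The pureness filter removes the minority-class copies of any unique weight vector that occurs with both match statuses (such as the single vector above with pureness 0.938). A sketch under the assumption that each weight vector carries its true match status; names are illustrative:

```python
from collections import defaultdict

def remove_non_pure(vectors_with_status):
    """For each unique weight vector, keep only the copies belonging to
    its majority match status (a tie resolves to 'match' here)."""
    by_vector = defaultdict(list)
    for vec, is_match in vectors_with_status:
        by_vector[tuple(vec)].append(is_match)
    kept = []
    for vec, statuses in by_vector.items():
        pureness = sum(statuses) / len(statuses)  # fraction of matches
        majority = pureness >= 0.5
        kept.extend((list(vec), s) for s in statuses if s == majority)
    return kept
```

For a vector that occurs 16 times with 15 match labels and 1 non-match label (pureness 0.938), this drops the one non-match copy, which is exactly the "Removed 1 non-pure weight vectors" behaviour logged above.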
Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (626, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 626 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 626 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 543 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 134 matches and 409 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (134, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (409, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 409 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 409 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 14 matches and 57 non-matches
    Purity of oracle classification:  0.803
    Entropy of oracle classification: 0.716
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(10)442_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (10, 1 - acm diverg, 442), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)442_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1021
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1021 weight vectors
  Containing 207 true matches and 814 true non-matches
    (20.27% true matches)
  Identified 965 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   929  (96.27%)
          2 :    33  (3.42%)
          3 :     2  (0.21%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 965 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 171
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 793

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1020
  Number of unique weight vectors: 965

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (965, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 965 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 965 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 878 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 311 matches and 567 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (311, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (567, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 567 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 567 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 0 matches and 73 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)836_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 836), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)836_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 663
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 663 weight vectors
  Containing 212 true matches and 451 true non-matches
    (31.98% true matches)
  Identified 608 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   572  (94.08%)
          2 :    33  (5.43%)
          3 :     2  (0.33%)
         19 :     1  (0.16%)
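
The frequency distribution above counts how often each distinct weight vector occurs. A minimal sketch of how such a distribution can be computed with `collections.Counter` (the vectors below are illustrative, not taken from the actual file):

```python
from collections import Counter

# Example weight vectors as tuples (hashable); illustrative data only
weight_vectors = [
    (1.0, 0.0, 0.5), (1.0, 0.0, 0.5),                    # occurs twice
    (0.5, 1.0, 0.2),                                     # occurs once
    (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0),  # occurs three times
]

vec_counts = Counter(weight_vectors)      # vector -> occurrence count
freq_dist = Counter(vec_counts.values())  # occurrence count -> number of vectors

print(freq_dist)  # Counter({1: 1, 2: 1, 3: 1})
```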

Identified 1 non-pure unique weight vector (from 608 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 177
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 430

Removed 1 non-pure weight vector

Final number of weight vectors to use: 662
  Number of unique weight vectors: 608

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (608, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 608 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 608 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
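
The "far" initial selection above is a farthest-first traversal: starting from one vector, it repeatedly picks the vector whose minimum distance to everything already selected is largest, so the sample spreads across the cluster. A minimal sketch using Euclidean distance (the starting rule and metric of the actual script are assumptions):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising its minimum
    Euclidean distance to the vectors selected so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # assumption: start from the first vector
    while len(selected) < k:
        # Pick the vector farthest from its nearest already-selected neighbour
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(vecs, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```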

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 26 matches and 57 non-matches
    Purity of oracle classification:  0.687
    Entropy of oracle classification: 0.897
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
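
The purity and entropy figures reported by the oracle follow the usual two-class definitions: purity is the majority-class fraction, and entropy is the binary entropy of the match proportion. A small sketch reproducing the numbers above (26 matches, 57 non-matches):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Two-class purity (majority fraction) and binary entropy."""
    n = num_matches + num_non_matches
    purity = max(num_matches, num_non_matches) / n
    p = num_matches / n  # match proportion
    if p in (0.0, 1.0):
        entropy = 0.0    # a pure cluster has zero entropy
    else:
        entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return purity, entropy

purity, entropy = purity_entropy(26, 57)
print(round(purity, 3), round(entropy, 3))  # 0.687 0.897
```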

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 525 weight vectors
  Based on 26 matches and 57 non-matches
  Classified 200 matches and 325 non-matches
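
The split step trains a classifier on the oracle-labelled vectors and uses its predictions to divide the remaining unlabelled vectors into two sub-clusters, which are then pushed back onto the queue. A minimal sketch of this idea with scikit-learn (the original script's SVM settings are unknown; the linear kernel and toy data here are assumptions):

```python
from sklearn.svm import SVC

# Oracle-labelled training vectors (toy data, illustrative only)
train_X = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
train_y = [1, 1, 0, 0]  # 1 = match, 0 = non-match

# Remaining unlabelled vectors in the cluster
rest = [[0.85, 0.9], [0.15, 0.1], [0.7, 0.75], [0.05, 0.2]]

clf = SVC(kernel="linear")  # assumption: actual kernel/parameters unknown
clf.fit(train_X, train_y)
pred = clf.predict(rest)

# Split the cluster into predicted-match and predicted-non-match sub-clusters
match_cluster = [v for v, p in zip(rest, pred) if p == 1]
non_match_cluster = [v for v, p in zip(rest, pred) if p == 0]
print(len(match_cluster), len(non_match_cluster))  # 2 2
```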

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (200, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)
    (325, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)

Current size of match and non-match training data sets: 26 / 57

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.90
- Size 200 weight vectors
- Estimated match proportion 0.313

Sample size for this cluster: 59

Farthest first selection of 59 weight vectors from 200 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.530, 1.000, 0.159, 0.086, 0.182, 0.159, 0.163] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 59 weight vectors
  The oracle will correctly classify 59 weight vectors and wrongly classify 0
  Classified 41 matches and 18 non-matches
    Purity of oracle classification:  0.695
    Entropy of oracle classification: 0.887
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  18
    Number of false non-matches: 0

Deleted 59 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(15)298_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 298), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)298_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 722
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 722 weight vectors
  Containing 219 true matches and 503 true non-matches
    (30.33% true matches)
  Identified 686 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   667  (97.23%)
          2 :    16  (2.33%)
          3 :     2  (0.29%)
         17 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 686 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 500

Removed 1 non-pure weight vector

Final number of weight vectors to use: 721
  Number of unique weight vectors: 686

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (686, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 686 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 686 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 35 matches and 49 non-matches
    Purity of oracle classification:  0.583
    Entropy of oracle classification: 0.980
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 602 weight vectors
  Based on 35 matches and 49 non-matches
  Classified 289 matches and 313 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (289, 0.5833333333333334, 0.9798687566511527, 0.4166666666666667)
    (313, 0.5833333333333334, 0.9798687566511527, 0.4166666666666667)

Current size of match and non-match training data sets: 35 / 49

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 313 weight vectors
- Estimated match proportion 0.417

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 313 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.700, 0.645, 0.316, 0.455, 0.714] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.667, 0.857, 0.353, 0.632, 0.550] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [0.667, 0.000, 0.800, 0.684, 0.667, 0.529, 0.609] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.950, 0.000, 0.619, 0.800, 0.478, 0.280, 0.625] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.611, 0.000, 0.800, 0.684, 0.500, 0.778, 0.609] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.533, 0.000, 0.577, 0.783, 0.429, 0.615, 0.478] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.600, 0.700, 0.600, 0.611, 0.706] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 0 matches and 72 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(10)549_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 549), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)549_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 742
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 742 weight vectors
  Containing 163 true matches and 579 true non-matches
    (21.97% true matches)
  Identified 721 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   710  (98.47%)
          2 :     8  (1.11%)
          3 :     2  (0.28%)
         10 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 721 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 144
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 576

Removed 1 non-pure weight vector

Final number of weight vectors to use: 741
  Number of unique weight vectors: 721

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (721, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 721 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 721 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

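The purity and entropy figures reported for each oracle classification (and for the clusters in the queue) follow directly from the match / non-match counts; a minimal sketch, with a helper name of our choosing:

```python
from math import log2

def purity_entropy(num_match, num_non_match):
    """Purity is the majority-class fraction; entropy is the Shannon
    entropy (base 2) of the match / non-match split."""
    total = num_match + num_non_match
    purity = max(num_match, num_non_match) / total
    entropy = -sum(c / total * log2(c / total)
                   for c in (num_match, num_non_match) if c > 0)
    return purity, entropy

# 29 matches and 55 non-matches, as in the oracle classification above
purity, entropy = purity_entropy(29, 55)
print(round(purity, 3), round(entropy, 3))  # 0.655 0.93
```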
Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 637 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 106 matches and 531 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (106, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (531, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 106 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 106 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

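The "farthest first" selections above can be reproduced with a greedy farthest-first traversal: start from one vector, then repeatedly add the vector whose distance to its closest already-selected vector is largest. A minimal sketch, assuming Euclidean distance (the original script's seeding and tie-breaking may differ):

```python
import math
import random

def farthest_first(vectors, k, seed=0):
    """Greedily select k vectors that are maximally spread out."""
    rnd = random.Random(seed)
    selected = [rnd.choice(vectors)]           # arbitrary starting vector
    while len(selected) < k:
        # pick the vector farthest from its nearest selected neighbour
        nxt = max(vectors,
                  key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(nxt)
    return selected

vecs = [(0, 0), (1, 0), (0, 1), (10, 10), (5, 5)]
print(farthest_first(vecs, 3))
```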
Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 43 matches and 5 non-matches
    Purity of oracle classification:  0.896
    Entropy of oracle classification: 0.482
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing file: diverg(10)331_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (10, 1 - acm diverg, 331), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)331_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 961
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 961 weight vectors
  Containing 165 true matches and 796 true non-matches
    (17.17% true matches)
  Identified 924 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   893  (96.65%)
          2 :    28  (3.03%)
          3 :     2  (0.22%)
          6 :     1  (0.11%)

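The uniqueness analysis above amounts to counting duplicate weight vectors and then tallying how many distinct vectors share each occurrence count; a minimal sketch (the function name is ours, not the script's):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then tally how
    many distinct vectors occur that often (as in the
    'Occurrence : Number of weight vectors' table above)."""
    per_vector = Counter(tuple(w) for w in weight_vectors)
    return Counter(per_vector.values())

vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
        (0.9, 0.9), (0.9, 0.9), (0.9, 0.9)]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```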
Identified 0 non-pure unique weight vectors (from 924 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 148
     0.000 : 776

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 961
  Number of unique weight vectors: 924

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (924, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 924 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 924 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 837 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 262 matches and 575 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (262, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (575, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 262 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 66

Farthest first selection of 66 weight vectors from 262 vectors
  The selected farthest weight vectors are:
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 42 matches and 24 non-matches
    Purity of oracle classification:  0.636
    Entropy of oracle classification: 0.946
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  24
    Number of false non-matches: 0

Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(20)879_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (20, 1 - acm diverg, 879), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)879_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 908
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 908 weight vectors
  Containing 204 true matches and 704 true non-matches
    (22.47% true matches)
  Identified 859 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   825  (96.04%)
          2 :    31  (3.61%)
          3 :     2  (0.23%)
         15 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 859 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 175
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 683

Removed 1 non-pure weight vector

Final number of weight vectors to use: 907
  Number of unique weight vectors: 859

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (859, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 859 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 859 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 27 matches and 59 non-matches
    Purity of oracle classification:  0.686
    Entropy of oracle classification: 0.898
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 773 weight vectors
  Based on 27 matches and 59 non-matches
  Classified 76 matches and 697 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (76, 0.686046511627907, 0.8976844934141643, 0.313953488372093)
    (697, 0.686046511627907, 0.8976844934141643, 0.313953488372093)

Current size of match and non-match training data sets: 27 / 59

Selected cluster (queue ordering: random):
- Purity 0.69 and entropy 0.90
- Size 697 weight vectors
- Estimated match proportion 0.314

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 697 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 19 matches and 55 non-matches
    Purity of oracle classification:  0.743
    Entropy of oracle classification: 0.822
    Number of true matches:      19
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
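
The purity and entropy reported for each oracle classification can be reproduced from the match / non-match counts alone. A minimal sketch (purity as the majority-class fraction, entropy as base-2 Shannon entropy of the split; the function name is illustrative, not the original program's):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity: fraction of the classified sample in the majority class.
    # Entropy: Shannon entropy (base 2) of the match / non-match split.
    total = num_matches + num_non_matches
    purity = max(num_matches, num_non_matches) / total
    entropy = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return purity, entropy

# 19 matches and 55 non-matches, as classified above:
p, e = purity_entropy(19, 55)  # p ~ 0.743, e ~ 0.822
```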

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(15)751_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 751), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)751_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1077
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1077 weight vectors
  Containing 221 true matches and 856 true non-matches
    (20.52% true matches)
  Identified 1021 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   985  (96.47%)
          2 :    33  (3.23%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
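
The occurrence table above can be computed with two counting passes: one over the raw vectors, one over the per-vector counts. A sketch (assuming weight vectors compare equal element-wise; function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # Count how often each exact weight vector occurs, then tabulate how
    # many unique vectors occur once, twice, three times, and so on.
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return sorted(Counter(per_vector.values()).items())

# e.g. [[0.5], [0.5], [0.9]] -> [(1, 1), (2, 1)]
```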

Identified 1 non-pure unique weight vector (from 1021 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 835

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1076
  Number of unique weight vectors: 1021

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1021, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1021 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1021 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
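
The farthest-first selection shown above can be sketched as a greedy loop that repeatedly adds the vector maximising the minimum Euclidean distance to the vectors already selected. (This sketch seeds deterministically with the first vector; the original program's seeding and distance metric may differ.)

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over a list of weight vectors
    # (tuples of floats): seed with the first vector, then repeatedly
    # select the vector whose minimum distance to the current selection
    # is largest.
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each pick scans every remaining vector against the current selection, which is quadratic overall but cheap at these cluster sizes (hundreds to about a thousand vectors).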

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 934 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 170 matches and 764 non-matches
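
The split step trains a classifier on the oracle-labelled sample and partitions the rest of the cluster by its predictions. The log uses an SVM; the self-contained sketch below substitutes a nearest-centroid rule (a far weaker classifier, but the same split structure):

```python
import math

def centroid(vectors):
    # Component-wise mean of a list of equal-length vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(match_sample, non_match_sample, cluster):
    # Assign each unlabelled vector to the nearer class centroid,
    # splitting the cluster into two sub-clusters for the queue.
    m_c = centroid(match_sample)
    n_c = centroid(non_match_sample)
    matches, non_matches = [], []
    for v in cluster:
        (matches if math.dist(v, m_c) <= math.dist(v, n_c)
         else non_matches).append(v)
    return matches, non_matches
```

With scikit-learn available, replacing the centroid rule with `sklearn.svm.SVC().fit(...)` / `.predict(...)` would match the log's SVM step more closely.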

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (170, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (764, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 170 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 170 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 43 matches and 15 non-matches
    Purity of oracle classification:  0.741
    Entropy of oracle classification: 0.825
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  15
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(20)551_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 551), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)551_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 156 matches and 800 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (800, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 800 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 800 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 4 matches and 71 non-matches
    Purity of oracle classification:  0.947
    Entropy of oracle classification: 0.300
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)834_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 834), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)834_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 781
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 781 weight vectors
  Containing 206 true matches and 575 true non-matches
    (26.38% true matches)
  Identified 752 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   735  (97.74%)
          2 :    14  (1.86%)
          3 :     2  (0.27%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 752 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 572

Removed 1 non-pure weight vector

Final number of weight vectors to use: 780
  Number of unique weight vectors: 752

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (752, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 752 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 752 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
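
Farthest-first traversal, as used for the selection above, can be sketched as follows: start from one vector and repeatedly pick the vector whose minimum Euclidean distance to the already-selected set is largest. This is a plain O(n·k) sketch; the original program may differ in seeding and distance measure:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly select the point
    with the largest minimum distance to the selected set."""
    selected = [vectors[0]]
    # Minimum distance from each vector to the selected set so far.
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected

vecs = [(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(vecs, 3))  # → [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

Because each new pick maximises the distance to everything chosen so far, the selected sample spreads across the weight-vector space, which is why the lists above mix clearly matching and clearly non-matching vectors.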

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 667 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 141 matches and 526 non-matches
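
The SVM split above can be sketched with scikit-learn (an assumption — the log does not say which SVM implementation the program uses): train on the oracle-labelled vectors, then divide the remaining cluster by predicted class.

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train a linear SVM on oracle-classified vectors and split the
    remaining cluster into predicted matches and non-matches."""
    clf = SVC(kernel='linear').fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if not p]
    return matches, non_matches

# Tiny 2-d illustration (high similarity weights ~ matches):
train = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]]
labels = [1, 1, 0, 0]
m, n = svm_split(train, labels, [[0.95, 0.85], [0.15, 0.05]])
print(len(m), len(n))  # → 1 1
```

The two predicted partitions then re-enter the cluster queue, which is why the next loop shows two clusters (141 and 526 vectors).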

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (526, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 526 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 526 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 7 matches and 67 non-matches
    Purity of oracle classification:  0.905
    Entropy of oracle classification: 0.452
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)362_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.99
recall                 0.331104
f-measure              0.496241
da                          100
dm                            0
ndm                           0
tp                           99
fp                            1
tn                  4.76529e+07
fn                          200
Name: (15, 1 - acm diverg, 362), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)362_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 745
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 745 weight vectors
  Containing 166 true matches and 579 true non-matches
    (22.28% true matches)
  Identified 724 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   713  (98.48%)
          2 :     8  (1.10%)
          3 :     2  (0.28%)
         10 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 724 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 147
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 576

Removed 1 non-pure weight vector

Final number of weight vectors to use: 744
  Number of unique weight vectors: 724

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (724, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 724 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 724 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 639 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 109 matches and 530 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (530, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 109 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 45 matches and 4 non-matches
    Purity of oracle classification:  0.918
    Entropy of oracle classification: 0.408
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(10)678_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984615
recall                 0.214047
f-measure              0.351648
da                           65
dm                            0
ndm                           0
tp                           64
fp                            1
tn                  4.76529e+07
fn                          235
Name: (10, 1 - acm diverg, 678), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)678_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 222
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 222 weight vectors
  Containing 183 true matches and 39 true non-matches
    (82.43% true matches)
  Identified 197 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   186  (94.42%)
          2 :     8  (4.06%)
          3 :     2  (1.02%)
         14 :     1  (0.51%)

Identified 1 non-pure unique weight vector (from 197 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 158
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 38

Removed 1 non-pure weight vector

Final number of weight vectors to use: 221
  Number of unique weight vectors: 197

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (197, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 197 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 65

Perform initial selection using "far" method

Farthest first selection of 65 weight vectors from 197 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 65 weight vectors
  The oracle will correctly classify 65 weight vectors and wrongly classify 0
  Classified 39 matches and 26 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0
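
The purity and entropy figures reported after each oracle call follow directly from the match / non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy (in bits) of the split. A minimal sketch that reproduces the numbers above (the program's own formula is not printed in this log, but these values agree with it):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy (in bits) of the match / non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# The oracle call above: 39 matches and 26 non-matches
purity, entropy = purity_entropy(39, 26)  # purity = 0.600, entropy ~ 0.971
```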

Deleted 65 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 132 weight vectors
  Based on 39 matches and 26 non-matches
  Classified 132 matches and 0 non-matches

65.0
Analysing the file: diverg(20)509_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 509), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)509_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
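
The frequency distribution above can be reproduced by counting each distinct weight vector and then counting the counts, for example:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """weight_vectors: iterable of hashable vectors (e.g. tuples).
    Returns a mapping from occurrence count to the number of distinct
    vectors that occur that often."""
    per_vector = Counter(weight_vectors)     # vector -> occurrences
    return Counter(per_vector.values())      # occurrences -> num. vectors

dist = occurrence_distribution([(1,), (1,), (2,), (3,)])
# dist == Counter({1: 2, 2: 1}): two vectors occur once, one occurs twice
```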

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector
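
A unique weight vector is non-pure when identical vectors carry both match and non-match labels (here, presumably the vector that occurs 20 times, with 19 of one label and 1 of the other, giving pureness 0.950); the minority-class copies are removed so each unique vector keeps a single label. A minimal sketch of such a filter (the helper name is hypothetical):

```python
from collections import defaultdict

def remove_minority_class(labelled_vectors):
    """labelled_vectors: list of (vector_tuple, is_match) pairs.
    Drops the minority-class copies of every vector that carries
    both labels; sketch of the filter described in the log."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        majority = sum(labels) * 2 >= len(labels)  # ties go to matches
        kept += [(vec, lab) for lab in labels if lab == majority]
    return kept
```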

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.05 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
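
Farthest-first selection repeatedly picks the vector whose minimum distance to the already-selected set is largest, so the chosen samples spread across the weight-vector space. A minimal sketch (the program's distance metric and seeding rule are not shown in this log; this version uses squared Euclidean distance and seeds with the first vector):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: select k vectors, each time
    taking the one with the largest minimum squared Euclidean
    distance to the vectors selected so far."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(sq_dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

sample = farthest_first([[0.0], [0.1], [1.0], [0.5]], 3)
# sample == [[0.0], [1.0], [0.5]]
```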

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 141 matches and 543 non-matches
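
Each split step trains a classifier on the oracle-labelled samples and partitions the remaining cluster by predicted class; the two resulting sub-clusters (here of sizes 141 and 543) are pushed onto the queue. A minimal sketch in which a nearest-centroid classifier stands in for the program's SVM (the actual SVM implementation is not shown in this log):

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split_cluster(match_samples, non_match_samples, cluster):
    """Assign each cluster vector to the nearer class centroid and
    return the (predicted-match, predicted-non-match) sub-clusters."""
    m_cent = centroid(match_samples)
    n_cent = centroid(non_match_samples)
    matches, non_matches = [], []
    for vec in cluster:
        if sq_dist(vec, m_cent) <= sq_dist(vec, n_cent):
            matches.append(vec)
        else:
            non_matches.append(vec)
    return matches, non_matches
```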

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 141 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)942_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 942), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)942_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 566 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)278_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 278), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)278_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
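
The purity and entropy figures reported above follow directly from the two class counts. A minimal sketch of that computation (function and variable names are illustrative, not taken from the original program):

```python
import math

def oracle_stats(num_matches, num_non_matches):
    # Purity: fraction of the majority class; entropy: Shannon entropy
    # (base 2) of the match / non-match split.
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p
```

For the 25 matches and 63 non-matches above this gives a purity of 0.716 and an entropy of 0.861, matching the log.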

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 817 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 131 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
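
Farthest-first selection greedily picks, at each step, the weight vector with the largest distance to its nearest already-selected vector, so the sample spreads across the whole cluster. A minimal sketch (the seed choice and squared-Euclidean distance are assumptions, not confirmed by the source):

```python
def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector, then
    # repeatedly add the vector farthest from the current selection.
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    selected = [vectors[0]]
    # min_dist[i]: squared distance from vectors[i] to the selection
    min_dist = [dist2(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        far_i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[far_i])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist2(v, vectors[far_i]))
    return selected
```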

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)321_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 321), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)321_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 649
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 649 weight vectors
  Containing 199 true matches and 450 true non-matches
    (30.66% true matches)
  Identified 622 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   606  (97.43%)
          2 :    13  (2.09%)
          3 :     2  (0.32%)
         11 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 622 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 447

Removed 1 non-pure weight vector

Final number of weight vectors to use: 648
  Number of unique weight vectors: 622
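
The analysis step above counts how often each distinct weight vector occurs and how pure each one is (the fraction of its occurrences generated by true matches); vectors whose occurrences mix matches and non-matches are flagged as non-pure and removed. A reconstruction inferred from the log output (the input layout is an assumption):

```python
from collections import Counter

def analyse_weight_vectors(pairs):
    # pairs: list of (weight_tuple, is_true_match) entries
    freq = Counter(w for w, _ in pairs)
    match_freq = Counter(w for w, is_match in pairs if is_match)
    occ_dist = Counter(freq.values())  # occurrence : number of vectors
    pureness = {w: match_freq[w] / c for w, c in freq.items()}
    non_pure = [w for w, p in pureness.items() if 0.0 < p < 1.0]
    return occ_dist, pureness, non_pure
```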

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (622, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 622 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 622 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 539 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 127 matches and 412 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (127, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (412, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 412 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 412 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 12 matches and 58 non-matches
    Purity of oracle classification:  0.829
    Entropy of oracle classification: 0.661
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(10)584_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 584), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)584_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 555
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 555 weight vectors
  Containing 173 true matches and 382 true non-matches
    (31.17% true matches)
  Identified 537 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   528  (98.32%)
          2 :     6  (1.12%)
          3 :     2  (0.37%)
          9 :     1  (0.19%)

Identified 1 non-pure unique weight vector (from 537 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 155
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 381

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 546
  Number of unique weight vectors: 536

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (536, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 536 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 536 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 27 matches and 54 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 455 weight vectors
  Based on 27 matches and 54 non-matches
  Classified 114 matches and 341 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (114, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (341, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 27 / 54

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.92
- Size 114 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 114 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 47 matches and 2 non-matches
    Purity of oracle classification:  0.959
    Entropy of oracle classification: 0.246
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0
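
The purity and entropy figures printed after each oracle step follow the standard binary definitions: purity is the majority-class fraction of the classified sample, and entropy is the binary Shannon entropy (in bits) of its match/non-match split. A minimal sketch reproducing the numbers above:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is the binary
    Shannon entropy (in bits) of the match/non-match distribution."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# 47 matches / 2 non-matches, as in the oracle output above:
# purity ~ 0.959, entropy ~ 0.246
```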

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing the file: diverg(20)1_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 1), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)1_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector
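
The non-pure-vector cleanup above drops the minority-class copies of any unique weight vector whose occurrences carry mixed match labels. A sketch of the idea (`remove_non_pure` is a hypothetical helper name, not the program's own, and keeping ties as matches is an assumption):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels):
    """Group identical weight vectors, compute each group's pureness
    (fraction of true matches among its occurrences), and drop the
    minority-class copies of any group that is not fully pure."""
    groups = defaultdict(list)
    for wv, lab in zip(weight_vectors, labels):
        groups[tuple(wv)].append(lab)
    kept_wv, kept_lab = [], []
    for wv, labs in groups.items():
        pureness = sum(labs) / len(labs)  # True counts as 1
        majority = pureness >= 0.5        # assumption: ties kept as matches
        for lab in labs:
            if pureness in (0.0, 1.0) or lab == majority:
                kept_wv.append(list(wv))
                kept_lab.append(lab)
    return kept_wv, kept_lab
```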

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
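
The "farthest first" sampling used throughout these loops is the greedy farthest-point traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already selected set is largest. A minimal sketch (Euclidean distance and a fixed starting index are assumptions; the program's details may differ):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first selection of k vectors: each step adds the
    vector maximising its minimum Euclidean distance to the selection."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```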

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
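
The oracle step simulates a manual reviewer with a given accuracy (`oracle_acc` in the usage notes; 100% in this run): each queried weight vector's true match status is returned correctly with that probability and flipped otherwise. A minimal sketch, with the per-label Bernoulli flip and the seed as assumptions:

```python
import random

def simulate_oracle(true_labels, accuracy, seed=42):
    """Return each true label unchanged with probability `accuracy`,
    flipped otherwise; accuracy 1.0 means no classification errors."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```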

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches
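
When a cluster is still impure, the oracle-labelled sample trains a classifier (`split_classifier`) that splits the cluster's remaining vectors into a predicted-match and a predicted-non-match child, as in the SVM step above. A sketch using scikit-learn's `SVC` (the library choice and the linear kernel are assumptions):

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, remaining_vectors):
    """Fit an SVM on the oracle-labelled sample, then split the
    unlabelled remainder into predicted matches and non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, preds) if p]
    non_matches = [v for v, p in zip(remaining_vectors, preds) if not p]
    return matches, non_matches
```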

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 118 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 118 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-matches
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)321_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (10, 1 - acm diverg, 321), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)321_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 456
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 456 weight vectors
  Containing 215 true matches and 241 true non-matches
    (47.15% true matches)
  Identified 421 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   406  (96.44%)
          2 :    12  (2.85%)
          3 :     2  (0.48%)
         20 :     1  (0.24%)

Identified 1 non-pure unique weight vector (from 421 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 240

Removed 1 non-pure weight vector

Final number of weight vectors to use: 455
  Number of unique weight vectors: 421

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (421, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 421 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 421 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 34 matches and 44 non-matches
    Purity of oracle classification:  0.564
    Entropy of oracle classification: 0.988
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 343 weight vectors
  Based on 34 matches and 44 non-matches
  Classified 141 matches and 202 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.5641025641025641, 0.9881108365218301, 0.4358974358974359)
    (202, 0.5641025641025641, 0.9881108365218301, 0.4358974358974359)

Current size of match and non-match training data sets: 34 / 44

Selected cluster (queue ordering: random) with:
- Purity 0.56 and entropy 0.99
- Size 202 weight vectors
- Estimated match proportion 0.436

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 202 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.929, 1.000, 0.182, 0.238, 0.188, 0.146, 0.270] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 7 matches and 57 non-matches
    Purity of oracle classification:  0.891
    Entropy of oracle classification: 0.498
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(20)140_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 140), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)140_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

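The farthest-first selection above (pick the weight vector whose minimum distance to the already-selected set is largest, repeat until the sample size is reached) can be sketched as a minimal greedy traversal. This is a generic illustration, not the script's actual implementation: the function name `farthest_first`, the arbitrary first seed, and the Euclidean metric are assumptions.

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector
    whose minimum distance to the already-selected set is largest."""
    selected = [vectors[0]]  # seed with an arbitrary starting vector (assumption)
    while len(selected) < k:
        best, best_dist = None, -1.0
        for v in vectors:
            if v in selected:
                continue
            # distance of v to the selected set = distance to its nearest member
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected
```

Each added vector maximises coverage of the cluster, which is why the selected samples above spread across both clear matches and clear non-matches.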
Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

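The purity and entropy figures reported for the oracle classification follow the standard binary definitions: purity is the majority-class fraction, entropy is the two-class Shannon entropy of the match proportion. A small sketch (the function name is illustrative) reproduces the values above from the 23 matches and 64 non-matches:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total          # estimated match proportion
    purity = max(p, 1.0 - p)         # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                  # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For 23 matches and 64 non-matches this gives purity 0.736 and entropy 0.833, matching the log output.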
Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

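The SVM step above trains on the oracle-labelled vectors and uses the resulting classifier to split the remaining weight vectors into two child clusters. A minimal stand-in is sketched below; a nearest-centroid rule deliberately replaces the actual SVM (which the script presumably obtains from an SVM library), so the function name and decision rule are illustrative assumptions, not the original method:

```python
import math

def split_cluster(labelled, remaining):
    """Split `remaining` vectors into (matches, non_matches) by distance
    to the centroids of the oracle-labelled training sets.
    `labelled` maps True/False -> list of weight vectors.
    This nearest-centroid rule is a stand-in for the real SVM split."""
    def centroid(vectors):
        n = len(vectors)
        return [sum(col) / n for col in zip(*vectors)]
    c_match = centroid(labelled[True])
    c_non = centroid(labelled[False])
    matches, non_matches = [], []
    for v in remaining:
        if math.dist(v, c_match) <= math.dist(v, c_non):
            matches.append(v)
        else:
            non_matches.append(v)
    return matches, non_matches
```

Both child clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.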
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)511_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 511), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)511_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 123 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)289_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 289), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)289_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 541
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 541 weight vectors
  Containing 220 true matches and 321 true non-matches
    (40.67% true matches)
  Identified 503 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   485  (96.42%)
          2 :    15  (2.98%)
          3 :     2  (0.40%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 503 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 318

Removed 1 non-pure weight vector

Final number of weight vectors to use: 540
  Number of unique weight vectors: 503

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (503, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 503 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 503 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 32 matches and 48 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 423 weight vectors
  Based on 32 matches and 48 non-matches
  Classified 142 matches and 281 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6, 0.9709505944546686, 0.4)
    (281, 0.6, 0.9709505944546686, 0.4)

Current size of match and non-match training data sets: 32 / 48

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 142 weight vectors
- Estimated match proportion 0.400

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
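Farthest-first selection, as used above, starts from one vector and repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster. A minimal sketch with Euclidean distance (the original script may use a different distance or starting rule):

```python
import math

def farthest_first(vectors, k, start=0):
    """Select k vectors by greedy max-min (farthest-first) traversal."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    # Minimum distance from every vector to the selected set so far
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[idx]))
    return selected

sample = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5], [0.9, 0.1]], 2)
print(sample)  # the start vector plus the point farthest from it
```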

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 53 matches and 3 non-matches
    Purity of oracle classification:  0.946
    Entropy of oracle classification: 0.301
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
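The run above traces one pass of the recursive selection loop: pop a cluster from the queue, draw a sample, have the oracle label it, add the labels to the training sets, and split the remainder into child clusters if the cluster is not pure enough, stopping when the manual-classification budget is exhausted. A compressed sketch under those assumptions; the sampling, purity threshold, and splitting rule are simplified stand-ins for the real ones:

```python
import random

def recursive_selection(vectors, labels, budget, sample_size=4,
                        min_purity=0.95, seed=0):
    """Hedged sketch of the budgeted cluster-splitting loop."""
    rng = random.Random(seed)
    queue = [list(range(len(vectors)))]   # one cluster of all indices
    train = {1: [], 0: []}                # match / non-match training sets
    used = 0                              # oracle classifications spent
    while queue and used + sample_size <= budget:
        cluster = queue.pop(rng.randrange(len(queue)))  # random queue order
        sample = rng.sample(cluster, min(sample_size, len(cluster)))
        for i in sample:                  # oracle: look up the true label
            train[labels[i]].append(i)
        used += len(sample)
        rest = [i for i in cluster if i not in sample]
        p = sum(labels[i] for i in sample) / len(sample)
        if rest and max(p, 1 - p) < min_purity:
            # Split the remainder on a crude rule standing in for the SVM
            left = [i for i in rest if vectors[i][0] >= 0.5]
            right = [i for i in rest if vectors[i][0] < 0.5]
            queue.extend(c for c in (left, right) if c)
    return train, used

vecs = [[v] for v in (0.9, 0.8, 0.1, 0.2, 0.95, 0.05, 0.7, 0.3)]
labs = [1, 1, 0, 0, 1, 0, 1, 0]
train, used = recursive_selection(vecs, labs, budget=8)
print(used)  # never exceeds the budget
```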

46.0
Analysing the file: diverg(15)2_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 2), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)2_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 597
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 597 weight vectors
  Containing 201 true matches and 396 true non-matches
    (33.67% true matches)
  Identified 566 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   550  (97.17%)
          2 :    13  (2.30%)
          3 :     2  (0.35%)
         15 :     1  (0.18%)
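The occurrence histogram above can be produced by counting duplicate weight vectors with a `Counter` and then counting how often each multiplicity occurs; the percentages are taken over the unique vectors. A small sketch with illustrative data:

```python
from collections import Counter

# Illustrative weight vectors; tuples so they are hashable
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.9), (0.2, 0.9), (0.2, 0.9),
           (0.7, 0.7)]

vector_counts = Counter(map(tuple, vectors))   # vector -> occurrences
freq_dist = Counter(vector_counts.values())    # occurrences -> #unique vectors

for occ in sorted(freq_dist):
    n = freq_dist[occ]
    print(f"{occ:>4} : {n:>4}  ({100.0 * n / len(vector_counts):.2f}%)")
```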

Identified 1 non-pure unique weight vector (from 566 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 393

Removed 1 non-pure weight vector
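A unique weight vector is "non-pure" when identical feature vectors carry both match and non-match labels; the cleaning step above keeps the majority label and drops the minority-class copies (the 0.933-pure vector in the log loses its single minority copy). A sketch with illustrative data; ties between labels are resolved arbitrarily here:

```python
from collections import defaultdict

# 15 copies of one vector (14 matches, 1 non-match -> pureness 0.933),
# plus 3 pure non-match copies of another
pairs = [((1.0, 0.9), 1)] * 14 + [((1.0, 0.9), 0)] + [((0.1, 0.2), 0)] * 3

by_vector = defaultdict(list)
for vec, label in pairs:
    by_vector[vec].append(label)

cleaned = []
for vec, labels in by_vector.items():
    majority = max(set(labels), key=labels.count)  # arbitrary on ties
    # Keep only the majority-class copies of each unique vector
    cleaned.extend((vec, majority) for lab in labels if lab == majority)

print(len(pairs), len(cleaned))  # 18 17 -- one minority copy removed
```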

Final number of weight vectors to use: 596
  Number of unique weight vectors: 566

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (566, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 566 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 31 matches and 51 non-matches
    Purity of oracle classification:  0.622
    Entropy of oracle classification: 0.957
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 484 weight vectors
  Based on 31 matches and 51 non-matches
  Classified 144 matches and 340 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)
    (340, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)

Current size of match and non-match training data sets: 31 / 51

Selected cluster (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 340 weight vectors
- Estimated match proportion 0.378

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 340 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.500, 0.826, 0.429, 0.538, 0.636] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 4 matches and 67 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.313
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing the file: diverg(15)454_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 454), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)454_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 943
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 943 weight vectors
  Containing 199 true matches and 744 true non-matches
    (21.10% true matches)
  Identified 898 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   864  (96.21%)
          2 :    31  (3.45%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 898 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 723

Removed 1 non-pure weight vector

Final number of weight vectors to use: 942
  Number of unique weight vectors: 898

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (898, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 898 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 898 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 25 matches and 61 non-matches
    Purity of oracle classification:  0.709
    Entropy of oracle classification: 0.870
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 812 weight vectors
  Based on 25 matches and 61 non-matches
  Classified 123 matches and 689 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)
    (689, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)

Current size of match and non-match training data sets: 25 / 61

Selected cluster (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 689 weight vectors
- Estimated match proportion 0.291

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 689 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 13 matches and 58 non-matches
    Purity of oracle classification:  0.817
    Entropy of oracle classification: 0.687
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(15)166_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 166), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)166_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 829
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 829 weight vectors
  Containing 227 true matches and 602 true non-matches
    (27.38% true matches)
  Identified 772 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   735  (95.21%)
          2 :    34  (4.40%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 772 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 581

Removed 1 non-pure weight vector

Final number of weight vectors to use: 828
  Number of unique weight vectors: 772

Time to load and analyse the weight vector file: 0.01 sec
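
The "non-pure" filtering above (where a vector occurring 20 times with pureness 0.950 loses its single minority-class copy) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the script's own code; the tie-breaking rule in particular is an assumption:

```python
from collections import Counter

def remove_minority_class(weight_vectors):
    """Group identical weight vectors; for any unique vector that occurs
    with both labels, keep only its majority-class copies so that every
    remaining unique vector is pure (ties fall to non-match here)."""
    counts = Counter((tuple(vec), label) for vec, label in weight_vectors)
    kept = []
    for vec in {tuple(v) for v, _ in weight_vectors}:
        m = counts[(vec, True)]    # copies labelled as matches
        n = counts[(vec, False)]   # copies labelled as non-matches
        label = m > n              # majority label; tie -> False (assumption)
        kept.extend([(list(vec), label)] * max(m, n))
    return kept

# A vector seen 20 times with 19 match labels has pureness 19/20 = 0.950,
# so its single non-match copy is removed.
```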

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (772, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 772 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 772 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

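The "farthest first" selection above can be sketched as a greedy farthest-first traversal: seed with an arbitrary vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance (the script's actual metric is not shown in this output):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal: seed with the first
    vector, then repeatedly add the vector farthest from the selected set."""
    selected = [vectors[0]]
    # minimum distance from each vector to the selected set so far
    min_dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected
```

Because each new pick maximises distance to all previous picks, the sample spreads over the weight-vector space rather than clustering near the seed.
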
Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
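
The purity and entropy figures reported for each oracle run follow directly from the match / non-match counts: purity is the majority-class fraction and entropy is the binary (base-2) entropy of the match proportion. A small sketch reproducing the numbers above (28 matches, 57 non-matches):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is the base-2
    entropy of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total              # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log(q, 2) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p

purity, entropy, prop = cluster_stats(28, 57)
print(round(purity, 3), round(entropy, 3), round(prop, 3))
# prints: 0.671 0.914 0.329 -- matching the oracle output above
```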

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 687 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 150 matches and 537 non-matches
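
The SVM step trains on the oracle-labelled sample and splits the remaining unlabelled vectors into predicted-match and predicted-non-match sub-clusters, which then re-enter the queue. A minimal sketch assuming scikit-learn is available; the kernel choice and the `svm_split` helper are illustrative, not the script's own API:

```python
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    """Fit an SVM on the oracle-classified sample (1 = match,
    0 = non-match) and split the rest of the cluster by predicted class."""
    clf = SVC(kernel="linear")   # kernel choice is an assumption
    clf.fit(labelled_vecs, labels)
    preds = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, preds) if p == 0]
    return matches, non_matches
```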

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (537, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 537 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 537 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 9 matches and 64 non-matches
    Purity of oracle classification:  0.877
    Entropy of oracle classification: 0.539
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)443_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                 0.976
recall                 0.408027
f-measure              0.575472
da                          125
dm                            0
ndm                           0
tp                          122
fp                            3
tn                  4.76529e+07
fn                          177
Name: (10, 1 - acm diverg, 443), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)443_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 661
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 661 weight vectors
  Containing 140 true matches and 521 true non-matches
    (21.18% true matches)
  Identified 645 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   634  (98.29%)
          2 :     8  (1.24%)
          3 :     2  (0.31%)
          5 :     1  (0.16%)

Identified 0 non-pure unique weight vectors (from 645 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 126
     0.000 : 519

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 661
  Number of unique weight vectors: 645

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (645, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 645 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 645 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 29 matches and 54 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.934
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 562 weight vectors
  Based on 29 matches and 54 non-matches
  Classified 90 matches and 472 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (90, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)
    (472, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)

Current size of match and non-match training data sets: 29 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 90 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 90 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 40 matches and 5 non-matches
    Purity of oracle classification:  0.889
    Entropy of oracle classification: 0.503
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

125.0
Analyzing file: diverg(20)443_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 443), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)443_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1027
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1027 weight vectors
  Containing 223 true matches and 804 true non-matches
    (21.71% true matches)
  Identified 973 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   936  (96.20%)
          2 :    34  (3.49%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 973 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 783

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 973

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (973, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 973 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 973 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
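
The purity and entropy figures reported above follow directly from the match and non-match counts: purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion, in bits. A minimal sketch of that computation (the function name is illustrative, not from the original script):

```python
import math

def cluster_purity_entropy(num_match, num_non_match):
    """Purity = fraction of the majority class; entropy = binary Shannon
    entropy (in bits) of the match proportion."""
    total = num_match + num_non_match
    p = num_match / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The 26 matches / 61 non-matches classified by the oracle above
purity, entropy = cluster_purity_entropy(26, 61)
print(round(purity, 3), round(entropy, 3))  # 0.701 0.88
```

With 26 matches out of 87 vectors this reproduces the logged purity of 0.701 and entropy of 0.880.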

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 886 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 131 matches and 755 non-matches
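
The SVM step trains on the vectors the oracle has labelled and partitions the remaining cluster by predicted class, producing the two child clusters pushed onto the queue. A sketch using scikit-learn's `SVC` (an assumption — the log only says "SVM", and the original kernel and parameters are not shown):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining unlabelled vectors into predicted matches / non-matches."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    matches, non_matches = [], []
    for vec, pred in zip(cluster_vecs, clf.predict(cluster_vecs)):
        (matches if pred == 1 else non_matches).append(vec)
    return matches, non_matches

# Toy example: high similarities labelled as matches (1), low as non-matches (0)
train = [[0.9, 0.8], [1.0, 0.9], [0.1, 0.2], [0.2, 0.1]]
labels = [1, 1, 0, 0]
m, n = svm_split(train, labels, [[0.95, 0.85], [0.15, 0.15]])
```

The two returned lists correspond to the "Classified ... matches and ... non-matches" line and become the next clusters in the queue.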

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (755, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 755 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 755 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
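
Farthest-first selection, as used for the samples above, greedily picks vectors that are maximally spread out: after a starting vector, each step adds the vector whose distance to its nearest already-selected vector is largest. A sketch (Euclidean distance and the choice of starting vector are assumptions; the original script may differ):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal: start from the first
    vector, then repeatedly add the vector whose distance to the closest
    already-selected vector is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while remaining and len(selected) < k:
        # For each candidate, consider its distance to the nearest selected vector
        far = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(far)
        remaining.remove(far)
    return selected

picked = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [0.5, 0.5]], 2)
# [0.0, 0.0] is the start; [1.0, 1.0] is the farthest point from it
```

This is why the selected samples mix very high and very low similarity vectors rather than clustering around one region.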

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 11 matches and 62 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)285_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 285), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)285_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 460
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 460 weight vectors
  Containing 210 true matches and 250 true non-matches
    (45.65% true matches)
  Identified 426 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   409  (96.01%)
          2 :    14  (3.29%)
          3 :     2  (0.47%)
         17 :     1  (0.23%)
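
The frequency table above counts, for each occurrence count, how many distinct weight vectors occur that often. This is a two-level tally that can be sketched with `collections.Counter` (the function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Count how often each distinct weight vector occurs, then count how
    many distinct vectors share each occurrence frequency (as in the
    'Occurrence : Number of weight vectors' table)."""
    vec_counts = Counter(tuple(v) for v in vectors)   # vector -> occurrences
    return Counter(vec_counts.values())               # occurrences -> vector count

dist = occurrence_distribution(
    [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3], [0.9, 0.9], [0.9, 0.9], [0.1, 0.1]]
)
# {2: 2, 1: 2} -> two distinct vectors occur twice, two occur once
```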

Identified 1 non-pure unique weight vector (from 426 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 247

Removed 1 non-pure weight vector
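
A unique weight vector is non-pure when its copies carry both match and non-match labels (here, pureness 0.941). Removing the minority-class copies of such vectors can be sketched as follows (a sketch of the step reported in the log; the helper name and tie handling are assumptions):

```python
from collections import defaultdict

def remove_minority_class(vectors, labels):
    """For each distinct weight vector, compute its pureness (fraction of
    copies labelled as matches) and drop the minority-class copies of any
    non-pure vector; pure vectors (pureness 0.0 or 1.0) are kept intact."""
    groups = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        groups[tuple(vec)].append(lab)
    kept = []
    for vec, lab in zip(vectors, labels):
        labs = groups[tuple(vec)]
        match_frac = sum(labs) / len(labs)  # pureness of this unique vector
        if match_frac in (0.0, 1.0):
            kept.append((vec, lab))
        else:
            majority = 1 if match_frac > 0.5 else 0
            if lab == majority:  # keep only the majority-class copies
                kept.append((vec, lab))
    return kept

# One vector occurs 3 times: twice as match, once as non-match (pureness 0.667)
data = [([0.5, 0.5], 1), ([0.5, 0.5], 1), ([0.5, 0.5], 0), ([0.9, 0.9], 1)]
kept = remove_minority_class([v for v, _ in data], [l for _, l in data])
# The single non-match copy is removed -> 3 weight vectors remain
```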

Final number of weight vectors to use: 459
  Number of unique weight vectors: 426

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (426, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 426 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 426 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 37 matches and 41 non-matches
    Purity of oracle classification:  0.526
    Entropy of oracle classification: 0.998
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  41
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 348 weight vectors
  Based on 37 matches and 41 non-matches
  Classified 246 matches and 102 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (246, 0.5256410256410257, 0.9981021327390103, 0.47435897435897434)
    (102, 0.5256410256410257, 0.9981021327390103, 0.47435897435897434)

Current size of match and non-match training data sets: 37 / 41

Selected cluster with (queue ordering: random):
- Purity 0.53 and entropy 1.00
- Size 102 weight vectors
- Estimated match proportion 0.474

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 102 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.635, 1.000, 0.179, 0.265, 0.167, 0.121, 0.241] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.800, 1.000, 0.111, 0.200, 0.100, 0.194, 0.094] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 0 matches and 50 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)853_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 853), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)853_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 432
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 432 weight vectors
  Containing 194 true matches and 238 true non-matches
    (44.91% true matches)
  Identified 408 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   391  (95.83%)
          2 :    14  (3.43%)
          3 :     2  (0.49%)
          7 :     1  (0.25%)

Identified 0 non-pure unique weight vectors (from 408 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.000 : 236

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 432
  Number of unique weight vectors: 408

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (408, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 408 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 408 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 38 matches and 40 non-matches
    Purity of oracle classification:  0.513
    Entropy of oracle classification: 1.000
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  40
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 330 weight vectors
  Based on 38 matches and 40 non-matches
  Classified 269 matches and 61 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (269, 0.5128205128205128, 0.9995256892936493, 0.48717948717948717)
    (61, 0.5128205128205128, 0.9995256892936493, 0.48717948717948717)

Current size of match and non-match training data sets: 38 / 40

Selected cluster with (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 269 weight vectors
- Estimated match proportion 0.487

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 269 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.913, 1.000, 0.184, 0.175, 0.087, 0.233, 0.167] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 43 matches and 28 non-matches
    Purity of oracle classification:  0.606
    Entropy of oracle classification: 0.968
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0
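The purity and entropy figures printed after each oracle step can be reproduced with a small helper. This is a sketch, not the script's actual code; the function names `purity` and `entropy` are mine:

```python
import math

def purity(num_matches, num_non_matches):
    """Fraction of the majority class among the classified vectors."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    """Shannon entropy (base 2) of the match / non-match split."""
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# Values from the oracle output above: 43 matches, 28 non-matches
print(round(purity(43, 28), 3))   # 0.606
print(round(entropy(43, 28), 3))  # 0.968
```

A perfectly balanced cluster has purity 0.5 and entropy 1.0, which is why every freshly loaded cluster starts the queue as `(size, 0.5, 1.0, 0.5)`.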

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(20)111_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 111), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)111_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector
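The pureness values in the table above (the fraction of a distinct weight vector's occurrences that are true matches) can be computed by grouping identical vectors. A minimal sketch, assuming the vectors and their true match labels arrive as parallel sequences; the helper name is hypothetical:

```python
from collections import defaultdict

def pureness_of_unique_vectors(weight_vectors, labels):
    """For each distinct weight vector, return the fraction of its
    occurrences that are true matches ('pureness')."""
    counts = defaultdict(lambda: [0, 0])   # vector -> [match count, total]
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)
        counts[key][0] += int(is_match)
        counts[key][1] += 1
    return {k: m / t for k, (m, t) in counts.items()}
```

A vector with pureness strictly between 0 and 1 is non-pure; the log shows its minority-class occurrences being removed before training-example selection starts.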

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
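The "far" initial selection above is a farthest-first traversal: each new vector maximises its minimum distance to the vectors already chosen, spreading the sample across the cluster. A minimal sketch (the original script may use a random starting vector and a different distance; here the start is the first vector and Euclidean distance is assumed):

```python
import numpy as np

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly select the vector
    whose minimum Euclidean distance to the already-selected set is
    largest, starting from the first vector."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [0]
    # distance of every vector to its nearest selected vector so far
    min_dist = np.linalg.norm(vectors - vectors[0], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(
            min_dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected

print(farthest_first([[0, 0], [1, 0], [10, 0]], 2))  # [0, 2]
```

Each update step costs O(n), so selecting k of n vectors is O(nk) overall, which matches the small sample sizes (around 85 of 800-900 vectors) seen in this run.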

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 0 matches and 829 non-matches
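The split step trains a classifier on the oracle-labelled vectors and partitions the rest of the cluster by predicted class. A sketch using scikit-learn's `SVC`; this is an assumption — the original script may use a different SVM implementation or kernel:

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, unlabelled_vectors):
    """Train an SVM on the oracle-classified weight vectors, then split
    the remaining cluster into predicted matches and non-matches."""
    clf = SVC(kernel='linear')
    clf.fit(train_vectors, train_labels)
    pred = clf.predict(unlabelled_vectors)
    matches = [v for v, p in zip(unlabelled_vectors, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vectors, pred) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are pushed back onto the queue, as the "Loop 2: Queue length: 2" lines later in this run show.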

40.0
Analysing file: diverg(20)745_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 745), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)745_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 855
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 855 weight vectors
  Containing 221 true matches and 634 true non-matches
    (25.85% true matches)
  Identified 799 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   763  (95.49%)
          2 :    33  (4.13%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 799 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 613

Removed 1 non-pure weight vector

Final number of weight vectors to use: 854
  Number of unique weight vectors: 799

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (799, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 799 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 799 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 714 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 150 matches and 564 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (564, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 150 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(15)768_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 768), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)768_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 668
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 668 weight vectors
  Containing 207 true matches and 461 true non-matches
    (30.99% true matches)
  Identified 637 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   623  (97.80%)
          2 :    11  (1.73%)
          3 :     2  (0.31%)
         17 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 637 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 460

Removed 1 non-pure weight vector

Final number of weight vectors to use: 667
  Number of unique weight vectors: 637

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (637, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 637 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 637 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 29 matches and 54 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.934
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 554 weight vectors
  Based on 29 matches and 54 non-matches
  Classified 140 matches and 414 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)
    (414, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)

Current size of match and non-match training data sets: 29 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 140 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 140 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
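Farthest-first selection, as used above, greedily picks vectors that maximise the minimum distance to the already-selected set, so the sample spreads across the cluster. A sketch of the traversal under the assumption of Euclidean distance (the original program's distance metric and seeding rule are not shown in this log):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    # Greedy farthest-first traversal: start from one vector, then
    # repeatedly add the vector whose minimum distance to the
    # already-selected set is largest.
    vectors = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(vectors)))]
    # Minimum distance from every vector to the selected set so far.
    min_dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(
            min_dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected

# Toy example: pick 3 of 100 random 7-dimensional weight vectors.
vecs = np.random.default_rng(1).uniform(0.0, 1.0, size=(100, 7))
picked = farthest_first(vecs, 3)
print(len(picked))  # 3
```

Because a selected vector's minimum distance drops to zero, the same index is never picked twice.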

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 50 matches and 4 non-matches
    Purity of oracle classification:  0.926
    Entropy of oracle classification: 0.381
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(15)912_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (15, 1 - acm diverg, 912), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)912_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 323
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 323 weight vectors
  Containing 207 true matches and 116 true non-matches
    (64.09% true matches)
  Identified 291 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   277  (95.19%)
          2 :    11  (3.78%)
          3 :     2  (0.69%)
         18 :     1  (0.34%)
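The frequency distribution above (how often each unique weight vector occurs) can be computed with two nested `Counter`s: one counting vector occurrences, one counting how many unique vectors share each occurrence count. A sketch on toy data; the vectors below are hypothetical:

```python
from collections import Counter

# Hypothetical weight vectors; tuples so they are hashable and countable.
weight_vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.8), (1.0, 0.5), (0.0, 0.1)]

vec_counts = Counter(weight_vectors)      # unique vector -> occurrence count
freq_dist = Counter(vec_counts.values())  # occurrence -> number of unique vectors

total_unique = len(vec_counts)
for occurrence in sorted(freq_dist):
    count = freq_dist[occurrence]
    print(f"{occurrence:>4} : {count:>4}  ({100.0 * count / total_unique:.2f}%)")
```

The percentages in the log use the number of unique vectors as denominator (e.g. 277 of 291 gives 95.19%).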

Identified 1 non-pure unique weight vector (from 291 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 115

Removed 1 non-pure weight vector

Final number of weight vectors to use: 322
  Number of unique weight vectors: 291
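The pureness filtering step above removes the minority-class copies of any unique weight vector that occurs with both true-match and true-non-match labels. A sketch on toy data mirroring the 0.944-pureness case in the log (the vector values are hypothetical):

```python
from collections import defaultdict

# Toy (vector, true_match) pairs; one vector appears with mixed labels.
pairs = ([((0.9, 1.0), True)] * 17 + [((0.9, 1.0), False)] * 1 +
         [((0.1, 0.2), False)] * 3)

labels_per_vec = defaultdict(list)
for vec, is_match in pairs:
    labels_per_vec[vec].append(is_match)

kept = []
for vec, is_match in pairs:
    labels = labels_per_vec[vec]
    pureness = sum(labels) / len(labels)  # fraction of true matches
    majority_is_match = pureness >= 0.5
    # Drop minority-class copies of non-pure vectors (0 < pureness < 1).
    if 0.0 < pureness < 1.0 and is_match != majority_is_match:
        continue
    kept.append((vec, is_match))

print(len(pairs), len(kept))  # 21 20
```

Here the mixed vector has pureness 17/18 ≈ 0.944, so its single non-match copy is removed, matching the "Removed 1 non-pure weight vectors" line of the log.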

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (291, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 291 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 72

Perform initial selection using "far" method

Farthest first selection of 72 weight vectors from 291 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 33 matches and 39 non-matches
    Purity of oracle classification:  0.542
    Entropy of oracle classification: 0.995
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  39
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 219 weight vectors
  Based on 33 matches and 39 non-matches
  Classified 146 matches and 73 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 72
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.5416666666666666, 0.9949848281859701, 0.4583333333333333)
    (73, 0.5416666666666666, 0.9949848281859701, 0.4583333333333333)

Current size of match and non-match training data sets: 33 / 39

Selected cluster (queue ordering: random) with:
- Purity 0.54 and entropy 0.99
- Size 146 weight vectors
- Estimated match proportion 0.458

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 50 matches and 8 non-matches
    Purity of oracle classification:  0.862
    Entropy of oracle classification: 0.579
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing the file: diverg(15)265_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 265), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)265_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 902
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 902 weight vectors
  Containing 214 true matches and 688 true non-matches
    (23.73% true matches)
  Identified 850 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   814  (95.76%)
          2 :    33  (3.88%)
          3 :     2  (0.24%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 850 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 667

Removed 1 non-pure weight vector

Final number of weight vectors to use: 901
  Number of unique weight vectors: 850

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (850, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 850 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 850 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 764 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 181 matches and 583 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (181, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (583, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 583 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 583 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.538, 0.789, 0.353, 0.545, 0.550] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.857, 0.417, 0.750, 0.500, 0.455] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 0 matches and 75 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(20)96_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 96), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)96_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1027
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1027 weight vectors
  Containing 223 true matches and 804 true non-matches
    (21.71% true matches)
  Identified 973 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   936  (96.20%)
          2 :    34  (3.49%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 973 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 783

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 973

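The single non-pure unique vector above occurs 17 times with pureness 0.941, i.e. 16 majority-class (match) copies and 1 minority-class copy; only that minority copy is removed, which is why 1027 drops to 1026. A hedged sketch of this filtering step, with illustrative data rather than the real vectors:

```python
from collections import defaultdict

# Illustrative labelled weight vectors (vector as tuple -> hashable):
# one vector occurs 17 times (16 matches, 1 non-match), one occurs 5 times (pure).
labelled = ([((1.0, 1.0), True)] * 16 + [((1.0, 1.0), False)]
            + [((0.2, 0.1), False)] * 5)

groups = defaultdict(list)
for vec, is_match in labelled:
    groups[vec].append(is_match)

kept, removed = [], 0
for vec, labels in groups.items():
    pureness = sum(labels) / len(labels)   # fraction of matches
    if pureness in (0.0, 1.0):             # pure vector: keep all copies
        kept += [(vec, lab) for lab in labels]
        continue
    majority = pureness >= 0.5             # majority-class label
    for lab in labels:
        if lab == majority:
            kept.append((vec, lab))
        else:
            removed += 1                   # drop minority-class copies

print(f"Removed {removed}; kept {len(kept)} weight vectors")
```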
Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (973, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 973 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest-first selection of 87 weight vectors from 973 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

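"Farthest first selection" as logged here is the standard farthest-first traversal: start from a seed vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal NumPy sketch; the seed choice and Euclidean metric are assumptions, and the script's actual implementation may differ:

```python
import numpy as np

def farthest_first(X, k, seed=0):
    """Select k row indices of X by farthest-first traversal (Euclidean)."""
    selected = [seed]
    # Distance of every point to the nearest selected point so far.
    min_dist = np.linalg.norm(X - X[seed], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))  # farthest from current selection
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected

# Tiny illustrative set of 2-D "weight vectors"
X = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 4.0], [1.0, 0.0]])
print(farthest_first(X, 3))  # picks well-spread points
```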
Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

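The purity and entropy above follow directly from the 26/61 split: purity is the majority-class fraction and entropy the binary Shannon entropy of the match proportion. The sketch below reproduces the printed 0.701 and 0.880; the formulas are inferred from those values, not taken from the script's code:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Majority-class fraction and binary Shannon entropy of a cluster."""
    n = num_match + num_non_match
    p = num_match / n
    purity = max(p, 1.0 - p)            # majority-class fraction
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q) # binary Shannon entropy
    return purity, entropy

purity, entropy = purity_entropy(26, 61)
print(f"Purity {purity:.3f}, entropy {entropy:.3f}")  # → Purity 0.701, entropy 0.880
```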
Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 886 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 131 matches and 755 non-matches

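The SVM step trains on the oracle-labelled sample and splits the remaining unlabelled vectors into candidate match and non-match clusters. A minimal scikit-learn sketch; the kernel and parameters are assumptions, since the script's SVM settings are not shown in this output:

```python
import numpy as np
from sklearn.svm import SVC

# Illustrative oracle-labelled training sample (1-D "weights" for brevity)
X_train = np.array([[0.1], [0.2], [0.8], [0.9]])
y_train = np.array([0, 0, 1, 1])  # 0 = non-match, 1 = match

clf = SVC(kernel="linear").fit(X_train, y_train)

# Classify the remaining, unlabelled weight vectors of the cluster
X_rest = np.array([[0.15], [0.85], [0.95]])
pred = clf.predict(X_rest)
matches = int(pred.sum())
print(f"Classified {matches} matches and {len(pred) - matches} non-matches")
```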
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (755, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 131 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 50

Farthest-first selection of 50 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 49 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.141
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(10)525_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978723
recall                 0.461538
f-measure              0.627273
da                          141
dm                            0
ndm                           0
tp                          138
fp                            3
tn                  4.76529e+07
fn                          161
Name: (10, 1 - acm diverg, 525), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)525_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 802
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 802 weight vectors
  Containing 118 true matches and 684 true non-matches
    (14.71% true matches)
  Identified 772 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   745  (96.50%)
          2 :    24  (3.11%)
          3 :     3  (0.39%)

Identified 0 non-pure unique weight vectors (from 772 unique weight vectors)
Pureness (as fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 108
     0.000 : 664

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 802
  Number of unique weight vectors: 772

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (772, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 772 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest-first selection of 85 weight vectors from 772 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 24 matches and 61 non-matches
    Purity of oracle classification:  0.718
    Entropy of oracle classification: 0.859
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 687 weight vectors
  Based on 24 matches and 61 non-matches
  Classified 82 matches and 605 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (82, 0.7176470588235294, 0.8586370819183629, 0.2823529411764706)
    (605, 0.7176470588235294, 0.8586370819183629, 0.2823529411764706)

Current size of match and non-match training data sets: 24 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 605 weight vectors
- Estimated match proportion 0.282

Sample size for this cluster: 69

Farthest-first selection of 69 weight vectors from 605 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 2 matches and 67 non-matches
    Purity of oracle classification:  0.971
    Entropy of oracle classification: 0.189
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

141.0
Analyzing file: diverg(20)550_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 550), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)550_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1068
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1068 weight vectors
  Containing 226 true matches and 842 true non-matches
    (21.16% true matches)
  Identified 1011 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   974  (96.34%)
          2 :    34  (3.36%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1011 unique weight vectors)
Pureness (as fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 821

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1067
  Number of unique weight vectors: 1011

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1011, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1011 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest-first selection of 87 weight vectors from 1011 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
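The purity and entropy printed above follow directly from the two oracle class counts (24 matches, 63 non-matches). A minimal sketch of how these figures can be reproduced (the function names are my own, not from the script):

```python
import math

def purity(n_match, n_nonmatch):
    """Majority-class proportion of a two-class cluster."""
    total = n_match + n_nonmatch
    return max(n_match, n_nonmatch) / total

def entropy(n_match, n_nonmatch):
    """Binary Shannon entropy of the match proportion."""
    total = n_match + n_nonmatch
    h = 0.0
    for n in (n_match, n_nonmatch):
        if n:  # 0 * log(0) is taken as 0
            p = n / total
            h -= p * math.log2(p)
    return h

# Oracle result above: 24 matches, 63 non-matches
print(f"{purity(24, 63):.3f}")   # 0.724
print(f"{entropy(24, 63):.3f}")  # 0.850
```

The match proportion 24/87 ≈ 0.276 is the same value reported as the estimated match proportion of the child clusters in the queue.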

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 924 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 131 matches and 793 non-matches
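The SVM step above trains on the oracle-labelled vectors and then classifies the remaining, unlabelled members of the cluster into two child clusters. A toy sketch of this train-then-split step using scikit-learn — the kernel and parameters used by the actual script are unknown, so this is only an assumed configuration:

```python
import numpy as np
from sklearn.svm import SVC

# Assumed 1-D toy data for brevity; the real weight vectors are 7-dimensional.
X_train = np.array([[0.1], [0.2], [0.8], [0.9]])  # oracle-labelled vectors
y_train = np.array([0, 0, 1, 1])                  # 0 = non-match, 1 = match

clf = SVC(kernel="linear")  # kernel choice is an assumption
clf.fit(X_train, y_train)

# Remaining unlabelled weight vectors in the cluster are split by prediction
X_rest = np.array([[0.15], [0.85]])
print(clf.predict(X_rest))  # [0 1]
```

The predicted labels partition the remaining vectors into a "match" and a "non-match" child cluster, both of which are pushed back onto the queue.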

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (793, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 131 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
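Farthest-first selection, as used above, greedily picks each next vector so that its minimum distance to the vectors already chosen is maximised, which yields a spread-out, diverse sample. A sketch under assumed details (seeding from the first vector; the script's actual seeding rule may differ):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedily select k indices, each maximising its minimum
    Euclidean distance to the already selected vectors."""
    selected = [start]
    # min_dist[i] = distance from vector i to its closest selected vector
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[nxt]))
    return selected

# 1-D toy example: the extremes are picked before any interior point
pts = [(0.0,), (1.0,), (2.0,), (10.0,)]
print(farthest_first(pts, 3))  # [0, 3, 2]
```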

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)1_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 1), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)1_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 380
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 380 weight vectors
  Containing 216 true matches and 164 true non-matches
    (56.84% true matches)
  Identified 347 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   331  (95.39%)
          2 :    13  (3.75%)
          3 :     2  (0.58%)
         17 :     1  (0.29%)
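The occurrence table above can be computed with two nested counts: first count how often each distinct weight vector occurs, then count how many vectors share each occurrence count. A small sketch with toy 2-dimensional vectors (the real vectors are 7-dimensional):

```python
from collections import Counter

# Toy weight vectors with some exact duplicates
vectors = [(1.0, 0.0), (1.0, 0.0),
           (0.5, 0.5), (0.5, 0.5), (0.5, 0.5),
           (0.0, 1.0)]

occ = Counter(map(tuple, vectors))  # vector -> how often it occurs
dist = Counter(occ.values())        # occurrence count -> number of vectors
for times in sorted(dist):
    print(f"{times} : {dist[times]}")
```

For the toy data this prints one vector occurring once, one twice, and one three times; the log's table is the same distribution over 347 unique vectors.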

Identified 1 non-pure unique weight vector (from 347 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 183
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 163

Removed 1 non-pure weight vector

Final number of weight vectors to use: 379
  Number of unique weight vectors: 347

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (347, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 347 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 347 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 46 matches and 29 non-matches
    Purity of oracle classification:  0.613
    Entropy of oracle classification: 0.963
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  29
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 272 weight vectors
  Based on 46 matches and 29 non-matches
  Classified 272 matches and 0 non-matches

42.0
Analysing file: diverg(15)422_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.99
recall                 0.331104
f-measure              0.496241
da                          100
dm                            0
ndm                           0
tp                           99
fp                            1
tn                  4.76529e+07
fn                          200
Name: (15, 1 - acm diverg, 422), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)422_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1020
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1020 weight vectors
  Containing 167 true matches and 853 true non-matches
    (16.37% true matches)
  Identified 981 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   952  (97.04%)
          2 :    26  (2.65%)
          3 :     2  (0.20%)
         10 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 981 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 148
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 832

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1019
  Number of unique weight vectors: 981

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (981, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 981 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 981 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 894 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 196 matches and 698 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (196, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (698, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 698 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 698 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(20)744_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 744), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)744_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1100
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1100 weight vectors
  Containing 227 true matches and 873 true non-matches
    (20.64% true matches)
  Identified 1043 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1006  (96.45%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1043 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1099
  Number of unique weight vectors: 1043

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1043, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1043 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1043 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
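
The "farthest first" selection above can be sketched in a few lines: seed with one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch (the seed choice and Euclidean distance are assumptions; the program's exact variant may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: pick k mutually distant vectors.
    Assumes Euclidean distance and seeding with the first vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    # minimum distance from each vector to the selected set so far
    min_d = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        min_d = [min(d, dist(v, vectors[i])) for v, d in zip(vectors, min_d)]
    return selected
```

Because each new pick maximises the distance to everything chosen so far, the sample spreads across the weight-vector space rather than clustering near the seed.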

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
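
The purity, entropy, and estimated match proportion reported throughout this log follow the standard two-class definitions; a small sketch:

```python
import math

def cluster_stats(num_match, num_non_match):
    """Two-class purity, binary entropy (in bits), and match proportion."""
    total = num_match + num_non_match
    p = num_match / total                 # estimated match proportion
    purity = max(p, 1.0 - p)              # fraction of the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy, p
```

For the 23 matches and 65 non-matches above this gives purity 0.739, entropy 0.829, and match proportion 0.261, matching the reported values.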

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 955 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 846 non-matches
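
The split step trains a classifier on the oracle-labelled vectors and uses it to divide the remaining cluster. A minimal sketch with scikit-learn's `svm.SVC` (the original program's kernel and parameters are unknown, so a linear kernel is assumed here):

```python
from sklearn import svm

def svm_split(labelled_vectors, labels, remaining_vectors):
    """Train an SVM on the oracle-labelled vectors, then split the rest
    of the cluster into predicted matches (1) and non-matches (0)."""
    clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(labelled_vectors, labels)
    preds = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vectors, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is how the queue length grows from 1 to 2 in the next loop.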

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)39_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 39), dtype: object
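
The precision, recall, and f-measure values in these per-file summaries are consistent with the usual definitions from the tp/fp/fn counts; a quick check:

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall, and F-measure from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

With tp=40, fp=0, fn=259 as above: precision 1.0, recall ~0.133779, f-measure ~0.235988, matching the printed row.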

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)39_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 959
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 959 weight vectors
  Containing 217 true matches and 742 true non-matches
    (22.63% true matches)
  Identified 904 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   868  (96.02%)
          2 :    33  (3.65%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)
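
The occurrence distribution above (how many unique weight vectors appear once, twice, and so on) can be computed with two nested `Counter`s; a sketch:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of unique weight vectors occurring
    that often (vectors are hashed as tuples)."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```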

Identified 1 non-pure unique weight vector (from 904 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 721

Removed 1 non-pure weight vector
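
The pureness filter can be sketched as follows: group duplicate weight vectors, compute each group's match fraction, and drop the minority-class copies of any group that mixes both labels (the tie-handling rule here is an assumption):

```python
from collections import defaultdict

def remove_non_pure(pairs):
    """pairs: iterable of (weight_vector_tuple, is_match).  Returns the
    pairs with minority-class copies of non-pure vectors removed."""
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)   # fraction of matches
        if 0.0 < pureness < 1.0:               # mixed labels: keep majority
            majority = pureness >= 0.5         # assumption: ties keep matches
            kept += [(vec, majority)] * sum(1 for l in labels if l == majority)
        else:
            kept += [(vec, labels[0])] * len(labels)
    return kept
```

A vector occurring 19 times with 18 match labels has pureness 18/19 = 0.947, and the single non-match copy is removed, as in the distribution above.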

Final number of weight vectors to use: 958
  Number of unique weight vectors: 904

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (904, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 904 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 904 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 817 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 150 matches and 667 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (667, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 150 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(20)951_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 951), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)951_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
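The farthest-first traversal used above can be sketched as a greedy k-center selection. This is a hypothetical re-implementation: Euclidean distance and a random starting vector are assumptions, and the actual script may seed and measure distance differently.

```python
import numpy as np

def farthest_first_selection(vectors, k, seed=0):
    """Greedy farthest-first (k-center) selection: start from one vector,
    then repeatedly add the vector whose distance to the nearest
    already-selected vector is largest."""
    X = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(X)))]  # random start (assumption)
    # distance of every vector to its nearest selected vector so far
    dist = np.linalg.norm(X - X[selected[0]], axis=1)
    while len(selected) < min(k, len(X)):
        nxt = int(np.argmax(dist))
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected
```

Because each new pick maximises the minimum distance to the current selection, the sample spreads across the whole cluster rather than concentrating in one region, which is why the listings above mix high- and low-similarity vectors.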

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
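The purity and entropy figures reported for each oracle classification follow the standard two-class definitions; a minimal sketch that reproduces the values printed in the log:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction of the sample; entropy is
    the binary Shannon entropy (base 2) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    if p in (0.0, 1.0):
        entropy = 0.0  # a pure sample carries no uncertainty
    else:
        entropy = -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
    return purity, entropy
```

For the 14 matches and 54 non-matches classified above this gives purity ≈ 0.794 and entropy ≈ 0.734, matching the log.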

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)74_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 74), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)74_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1100
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1100 weight vectors
  Containing 227 true matches and 873 true non-matches
    (20.64% true matches)
  Identified 1043 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1006  (96.45%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
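The occurrence table above (how many unique weight vectors occur exactly once, twice, and so on) can be reproduced with a nested `Counter`; `occurrence_distribution` is a hypothetical helper name, not one from the script:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of unique weight vectors that
    occur exactly that often (the 'Occurrence : Number ...' table)."""
    # count how often each distinct weight vector appears
    per_vector = Counter(tuple(v) for v in weight_vectors)
    # then count how many vectors share each occurrence count
    return dict(sorted(Counter(per_vector.values()).items()))
```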

Identified 1 non-pure unique weight vector (from 1043 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1099
  Number of unique weight vectors: 1043
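The non-pure vector above (pureness 0.950, i.e. one minority-class copy among its occurrences) is handled by dropping the minority-class copies of each non-pure unique vector. A sketch under the assumption that ties are resolved towards the match class:

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """labelled_vectors: iterable of (weight_tuple, is_match).
    Keeps only the majority-class copies of each unique weight vector,
    dropping the minority-class copies of non-pure vectors."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(bool(is_match))
    kept = []
    for vec, labels in groups.items():
        n_match = sum(labels)
        n_non = len(labels) - n_match
        majority = n_match >= n_non  # tie-break towards match (assumption)
        kept += [(vec, majority)] * max(n_match, n_non)
    return kept
```

Applied to the 1100 vectors loaded above, this would drop the single minority copy of the 0.950-pureness vector and leave 1099, as the log reports.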

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1043, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1043 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1043 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
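The oracle step above simulates a human labeller of configurable accuracy (the `oracle_acc` parameter from the usage string); a minimal sketch, assuming each label is independently flipped with probability 1 − accuracy:

```python
import random

def noisy_oracle(true_labels, accuracy, seed=0):
    """Return (labels, num_wrong): each true label is kept with
    probability `accuracy` and flipped otherwise.  With accuracy = 1.0
    every vector is classified correctly, as in the log."""
    rng = random.Random(seed)
    out, wrong = [], 0
    for label in true_labels:
        if rng.random() < accuracy:
            out.append(label)
        else:
            out.append(not label)
            wrong += 1
    return out, wrong
```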

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 955 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 846 non-matches
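After the oracle-labelled vectors are removed, the remaining cluster is split by a classifier trained on the current match/non-match training sets. The log uses an SVM; to keep this sketch dependency-free, a nearest-centroid rule stands in for it, so it only approximates the SVM's decision boundary.

```python
import numpy as np

def split_cluster(cluster, match_train, nonmatch_train):
    """Split `cluster` into (predicted_matches, predicted_non_matches)
    by assigning each vector to the closer class centroid.  A stand-in
    for the SVM classification step shown in the log."""
    X = np.asarray(cluster, dtype=float)
    m_centroid = np.mean(np.asarray(match_train, dtype=float), axis=0)
    n_centroid = np.mean(np.asarray(nonmatch_train, dtype=float), axis=0)
    is_match = (np.linalg.norm(X - m_centroid, axis=1)
                < np.linalg.norm(X - n_centroid, axis=1))
    return X[is_match], X[~is_match]
```

The two sub-clusters are then pushed back onto the queue, which is why the next loop reports a queue length of 2.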

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)83_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 83), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)83_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analysing the file: diverg(20)170_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 170), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)170_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
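
The purity and entropy values reported for each oracle classification follow directly from the match and non-match counts. A minimal sketch (the function name `purity_entropy` is illustrative, not the program's own): purity is the majority-class fraction, and entropy is the binary Shannon entropy, in bits, of the match proportion.

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is the binary
    Shannon entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The oracle above classified 28 matches and 60 non-matches:
purity, entropy = purity_entropy(28, 60)
print(round(purity, 3), round(entropy, 3))  # 0.682 0.902
```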

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 156 matches and 800 non-matches
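
The SVM step trains a binary classifier on the oracle-labelled samples and partitions the remaining weight vectors of the cluster by predicted class. A minimal sketch assuming scikit-learn's `SVC` with a linear kernel (the actual SVM library and parameters used by the program are not shown in this output):

```python
# Sketch only: assumes scikit-learn; the program may use a different
# SVM implementation, kernel, or parameters.
from sklearn.svm import SVC

def svm_split(train_matches, train_non_matches, cluster_vectors):
    """Train an SVM on the labelled samples, then split the cluster
    into predicted matches and predicted non-matches."""
    X = train_matches + train_non_matches
    y = [1] * len(train_matches) + [0] * len(train_non_matches)
    clf = SVC(kernel="linear")
    clf.fit(X, y)
    pred = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, pred) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters then re-enter the processing queue.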

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (800, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 800 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 800 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
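
The farthest-first selection used for sampling can be sketched as a standard farthest-first traversal. This is an illustrative reimplementation, assuming Euclidean distance between weight vectors and a random starting vector; the program's actual distance measure and seeding may differ.

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Select k vectors by farthest-first traversal: start from a
    random vector, then repeatedly add the vector whose distance to
    its nearest already-selected vector is largest."""
    rnd = random.Random(seed)
    remaining = list(vectors)
    selected = [remaining.pop(rnd.randrange(len(remaining)))]

    def dist(a, b):  # Euclidean distance (an assumption)
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    while remaining and len(selected) < k:
        idx = max(range(len(remaining)),
                  key=lambda i: min(dist(remaining[i], s) for s in selected))
        selected.append(remaining.pop(idx))
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
print(len(farthest_first(vecs, 3)))  # 3
```

This spreads the sample across the cluster, which is why the selected vectors listed above are so heterogeneous.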

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 4 matches and 71 non-matches
    Purity of oracle classification:  0.947
    Entropy of oracle classification: 0.300
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)413_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (10, 1 - acm diverg, 413), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)413_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 289
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 289 weight vectors
  Containing 157 true matches and 132 true non-matches
    (54.33% true matches)
  Identified 273 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   261  (95.60%)
          2 :     9  (3.30%)
          3 :     2  (0.73%)
          4 :     1  (0.37%)

Identified 0 non-pure unique weight vectors (from 273 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 143
     0.000 : 130

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 289
  Number of unique weight vectors: 273

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (273, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 273 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 71

Perform initial selection using "far" method

Farthest first selection of 71 weight vectors from 273 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 30 matches and 41 non-matches
    Purity of oracle classification:  0.577
    Entropy of oracle classification: 0.983
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  41
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 202 weight vectors
  Based on 30 matches and 41 non-matches
  Classified 115 matches and 87 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 71
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (115, 0.5774647887323944, 0.982615428552612, 0.4225352112676056)
    (87, 0.5774647887323944, 0.982615428552612, 0.4225352112676056)

Current size of match and non-match training data sets: 30 / 41

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 87 weight vectors
- Estimated match proportion 0.423

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 87 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 2 matches and 43 non-matches
    Purity of oracle classification:  0.956
    Entropy of oracle classification: 0.262
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing the file: diverg(15)652_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 652), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)652_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 809
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 809 weight vectors
  Containing 223 true matches and 586 true non-matches
    (27.56% true matches)
  Identified 755 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   718  (95.10%)
          2 :    34  (4.50%)
          3 :     2  (0.26%)
         17 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 755 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 565

Removed 1 non-pure weight vector

Final number of weight vectors to use: 808
  Number of unique weight vectors: 755

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (755, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 755 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 755 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 670 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 94 matches and 576 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (576, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 576 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 576 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 20 matches and 53 non-matches
    Purity of oracle classification:  0.726
    Entropy of oracle classification: 0.847
    Number of true matches:      20
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0
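
The purity and entropy figures reported for each oracle classification can be reproduced as the majority-class fraction and the binary Shannon entropy of the match proportion; a minimal sketch (assuming exactly these two standard definitions, which match the numbers printed above):

```python
import math

def purity(num_match, num_non_match):
    # Fraction of the sample belonging to the majority class
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    # Binary Shannon entropy of the match proportion
    total = num_match + num_non_match
    p = num_match / total
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Reproduce the oracle summary above: 20 matches, 53 non-matches
print(round(purity(20, 53), 3))   # 0.726
print(round(entropy(20, 53), 3))  # 0.847
```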

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)418_NEW.csv
<class 'pandas.core.series.Series'>
Current line here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 418), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)418_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)
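
The occurrence distribution above can be obtained by counting duplicate weight vectors and then counting how many vectors share each frequency; a hypothetical sketch (the function name `occurrence_distribution` is illustrative, not from the program):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # Count how often each weight vector occurs (tuples are hashable),
    # then count how many distinct vectors occur with each frequency
    vec_counts = Counter(map(tuple, weight_vectors))
    freq_dist = Counter(vec_counts.values())
    return dict(sorted(freq_dist.items()))

vectors = [[0.5, 1.0], [0.5, 1.0], [0.2, 0.3], [0.9, 0.1]]
print(occurrence_distribution(vectors))  # {1: 2, 2: 1}
```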

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
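
Farthest-first traversal, as used for the selection above, greedily picks the vector whose minimum distance to the already-selected set is largest; a minimal sketch (assuming Euclidean distance and seeding with the first vector, both of which are assumptions about the program's settings):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: start from the first vector, then
    # repeatedly add the vector farthest from everything selected so far
    selected = [vectors[0]]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected

sample = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.1, 0.1], [0.9, 0.8]], 2)
print(sample)  # [[0.0, 0.0], [1.0, 1.0]]
```

This spreads the sample across the weight-vector space, which is why the selected vectors above mix very high and very low similarity values.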

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches
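
The split step trains a binary classifier on the oracle-labelled sample and classifies the remaining vectors in the cluster; a hypothetical sketch using scikit-learn's `SVC` (the kernel choice and the toy data are assumptions, not the program's actual settings):

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical oracle-labelled sample of weight vectors
train_X = np.array([[0.9, 1.0, 0.8], [1.0, 0.9, 0.9],   # matches
                    [0.2, 0.0, 0.1], [0.3, 0.1, 0.2]])  # non-matches
train_y = np.array([1, 1, 0, 0])  # 1 = match, 0 = non-match

# Remaining unlabelled weight vectors in the cluster
rest_X = np.array([[0.95, 0.9, 0.85], [0.1, 0.0, 0.3]])

clf = SVC(kernel="linear")  # linear kernel is an assumption
clf.fit(train_X, train_y)
pred = clf.predict(rest_X)  # splits the cluster into predicted matches / non-matches
print(pred)
```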

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)68_NEW.csv
<class 'pandas.core.series.Series'>
Current line here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 68), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)68_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 223 true matches and 585 true non-matches
    (27.60% true matches)
  Identified 754 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   717  (95.09%)
          2 :    34  (4.51%)
          3 :     2  (0.27%)
         17 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 754 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 564

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 754

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (754, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 754 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 754 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 669 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 93 matches and 576 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (93, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (576, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 576 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 576 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 20 matches and 53 non-matches
    Purity of oracle classification:  0.726
    Entropy of oracle classification: 0.847
    Number of true matches:      20
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)63_NEW.csv
<class 'pandas.core.series.Series'>
Current line here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 63), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)63_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
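The "far" initial selection above is a farthest-first traversal: start from a seed vector and repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch (the seed choice, the Euclidean metric, and the function name are assumptions, not taken from the original program):

```python
import math

def farthest_first(vectors, k, seed_index=0):
    """Greedy farthest-first traversal over a list of weight vectors.
    Seed choice (index 0) and Euclidean distance are assumptions."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[seed_index]]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected
```

A call like `farthest_first(weight_vectors, 88)` would correspond to the "Farthest first selection of 88 weight vectors from 1038 vectors" step logged above.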

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
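The purity and entropy figures logged above are consistent with the standard two-class definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy (in bits) of the match proportion. The formulas below are inferred from the logged values, not shown in the log itself:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Two-class purity (majority-class fraction) and Shannon entropy in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# 23 matches and 65 non-matches, as classified by the oracle above
purity, entropy = purity_entropy(23, 65)
print(round(purity, 3), round(entropy, 3))  # → 0.739 0.829
```

The same formulas reproduce the later oracle blocks as well, e.g. 43 matches and 0 non-matches giving purity 1.000 and entropy 0.000.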

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches
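At this step the program trains a classifier on the 23 + 65 oracle-labelled vectors and uses it to split the remaining 950 vectors into predicted-match and predicted-non-match clusters. The log uses an SVM (scikit-learn's `svm.SVC` would be a natural fit); as a dependency-free sketch of the same split step, here is a nearest-centroid stand-in, plainly a simplification of the actual classifier:

```python
def split_cluster(unlabelled, match_train, non_match_train):
    """Split remaining weight vectors into two clusters using labelled
    training sets. Nearest-centroid stand-in for the SVM used in the log."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cm, cn = centroid(match_train), centroid(non_match_train)
    matches, non_matches = [], []
    for v in unlabelled:
        (matches if sq_dist(v, cm) <= sq_dist(v, cn) else non_matches).append(v)
    return matches, non_matches
```

Both resulting clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.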

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 103 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 103 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 43 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(10)763_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 763), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)763_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 718
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 718 weight vectors
  Containing 203 true matches and 515 true non-matches
    (28.27% true matches)
  Identified 692 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   678  (97.98%)
          2 :    11  (1.59%)
          3 :     2  (0.29%)
         12 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 692 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 177
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 514

Removed 1 non-pure weight vectors
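The non-pure filtering logged above can be sketched as follows: group duplicated weight vectors, compute each unique vector's pureness (the fraction of its copies that are true matches), and drop the minority-class copies of any vector whose copies carry mixed labels. Names and the tie-break rule are illustrative, not taken from the original program:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """weight_vectors: list of (vector_tuple, is_true_match) pairs.
    Drops minority-class copies of unique vectors with mixed labels."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        n_match = sum(labels)
        pureness = n_match / len(labels)
        if 0.0 < pureness < 1.0:
            # keep only the majority-class copies (tie-break is an assumption)
            majority = pureness >= 0.5
            kept += [(vec, majority)] * max(n_match, len(labels) - n_match)
        else:
            kept += [(vec, labels[0])] * len(labels)
    return kept
```

For the run above, one unique vector with pureness 0.917 (11 of 12 copies being true matches) loses its single non-match copy, reducing 718 weight vectors to 717.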

Final number of weight vectors to use: 717
  Number of unique weight vectors: 692

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (692, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 692 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 692 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 608 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 114 matches and 494 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (114, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (494, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 114 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 114 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 47 matches and 2 non-matches
    Purity of oracle classification:  0.959
    Entropy of oracle classification: 0.246
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(20)762_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 762), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)762_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 153 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
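
The farthest-first selections logged above greedily pick, at each step, the vector whose minimum distance to the already-selected vectors is largest, so the sample spreads across the cluster. A minimal sketch of that traversal (Euclidean distance and seeding from the first vector are assumptions; the log does not show the actual seeding):

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal."""
    selected = [vectors[0]]  # seed choice is an assumption
    # minimum distance from each vector to the selected set
    dist = [euclidean(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            dist[j] = min(dist[j], euclidean(v, vectors[i]))
    return selected

corners = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0), (0.5, 0.5)]
print(farthest_first(corners, 3))  # → [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```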

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0
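
The purity and entropy figures reported for each oracle classification follow the usual two-class definitions: with m matches and n non-matches, purity is max(m, n)/(m + n) and entropy is the binary Shannon entropy of the match proportion. A minimal sketch reproducing the values above (50 matches, 5 non-matches; the function name is hypothetical):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Two-class purity and binary Shannon entropy of a cluster."""
    total = num_match + num_non_match
    p = num_match / total  # match proportion
    purity = max(num_match, num_non_match) / total
    # entropy is 0 by convention for a pure cluster
    if p in (0.0, 1.0):
        entropy = 0.0
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy

purity, entropy = purity_entropy(50, 5)
print(round(purity, 3), round(entropy, 3))  # → 0.909 0.439
```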

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing the file: diverg(20)673_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 673), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)673_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 754
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 754 weight vectors
  Containing 222 true matches and 532 true non-matches
    (29.44% true matches)
  Identified 718 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   699  (97.35%)
          2 :    16  (2.23%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 718 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 529

Removed 1 non-pure weight vector

Final number of weight vectors to use: 753
  Number of unique weight vectors: 718

Time to load and analyse the weight vector file: 0.01 sec
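
The duplicate analysis above (the number of unique weight vectors and the occurrence frequency distribution) amounts to two rounds of counting; a minimal sketch with `collections.Counter` (the function name is hypothetical):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each weight vector occurs, then tally how many
    vectors occur once, twice, etc."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    freq_dist = Counter(vec_counts.values())
    return len(vec_counts), dict(sorted(freq_dist.items()))

vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
        (0.9, 0.9), (0.9, 0.9), (0.9, 0.9)]
print(occurrence_distribution(vecs))  # → (3, {1: 1, 2: 1, 3: 1})
```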

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (718, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 718 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 718 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 634 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 135 matches and 499 non-matches
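
The SVM split step above (train on the oracle-labelled vectors, then classify the remaining cluster into predicted matches and non-matches) can be sketched with scikit-learn's `SVC`; the kernel and parameters are assumptions, since the log does not state them:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(labeled_vecs, labels, unlabeled_vecs):
    """Train an SVM on oracle-labelled weight vectors and split the
    remaining vectors into predicted matches and non-matches."""
    clf = SVC()  # default RBF kernel is an assumption
    clf.fit(labeled_vecs, labels)
    pred = clf.predict(unlabeled_vecs)
    return unlabeled_vecs[pred == 1], unlabeled_vecs[pred == 0]

# Toy illustration: high similarities -> match (1), low -> non-match (0)
rng = np.random.default_rng(42)
X = np.vstack([rng.uniform(0.8, 1.0, size=(20, 7)),
               rng.uniform(0.0, 0.3, size=(20, 7))])
y = np.array([1] * 20 + [0] * 20)
rest = np.vstack([rng.uniform(0.8, 1.0, size=(5, 7)),
                  rng.uniform(0.0, 0.3, size=(5, 7))])
m, n = svm_split(X, y, rest)
print(len(m), len(n))
```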

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (499, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.92
- Size 499 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 499 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 13 matches and 60 non-matches
    Purity of oracle classification:  0.822
    Entropy of oracle classification: 0.676
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing the file: diverg(15)783_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 783), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)783_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 708
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 708 weight vectors
  Containing 196 true matches and 512 true non-matches
    (27.68% true matches)
  Identified 684 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   667  (97.51%)
          2 :    14  (2.05%)
          3 :     2  (0.29%)
          7 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 684 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 510

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 708
  Number of unique weight vectors: 684

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (684, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 684 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 684 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 32 matches and 52 non-matches
    Purity of oracle classification:  0.619
    Entropy of oracle classification: 0.959
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 600 weight vectors
  Based on 32 matches and 52 non-matches
  Classified 285 matches and 315 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (285, 0.6190476190476191, 0.9587118829771318, 0.38095238095238093)
    (315, 0.6190476190476191, 0.9587118829771318, 0.38095238095238093)

Current size of match and non-match training data sets: 32 / 52

Selected cluster (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 315 weight vectors
- Estimated match proportion 0.381

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 315 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.667, 0.000, 0.800, 0.684, 0.667, 0.529, 0.609] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 0 matches and 70 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analyzing the file: diverg(20)29_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 29), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)29_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector
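
This clean-up step (dropping, for each non-pure unique weight vector, the duplicates that carry the minority true-match status) can be sketched as follows; the function name and tie-breaking are illustrative assumptions, not the program's actual code:

```python
from collections import Counter

def remove_minority_class(vectors, labels):
    """Group identical weight vectors; within each non-pure group, drop the
    vectors carrying the minority true-match status (a sketch, not the
    original implementation)."""
    counts = {}
    for vec, lab in zip(vectors, labels):
        counts.setdefault(tuple(vec), Counter())[lab] += 1
    kept_vectors, kept_labels = [], []
    for vec, lab in zip(vectors, labels):
        majority = counts[tuple(vec)].most_common(1)[0][0]
        if lab == majority:          # keep only majority-class duplicates
            kept_vectors.append(vec)
            kept_labels.append(lab)
    return kept_vectors, kept_labels
```

A unique vector occurring 17 times with 16 matches and 1 non-match (pureness 0.941, as in the log above) would thus lose its single non-match duplicate.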

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
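
The "far" initial selection shown above is a farthest-first traversal: each new pick maximises its distance to the vectors already selected. A minimal sketch, assuming Euclidean distance and a deterministic first pick (the actual program may seed or break ties differently):

```python
import numpy as np

def farthest_first(vectors, k):
    """Greedily select k vector indices, each as far as possible
    (Euclidean distance) from the set selected so far."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [0]  # deterministic first pick (an assumption)
    # distance from every vector to its nearest selected vector
    dist = np.linalg.norm(vectors - vectors[0], axis=1)
    for _ in range(k - 1):
        nxt = int(dist.argmax())  # farthest from the current selection
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```

Each iteration costs one distance pass over the data, so selecting k of n vectors is O(k·n) distance computations.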

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
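
The purity and entropy figures reported for each oracle-classified sample can be reproduced from the binary class counts alone; a sketch, assuming purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the match proportion:

```python
import math

def purity_entropy(num_match, num_nonmatch):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy (in bits) of the match proportion."""
    total = num_match + num_nonmatch
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 26 matches and 61 non-matches above this yields purity 61/87 ≈ 0.701 and entropy ≈ 0.880, in line with the logged values.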

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches
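
The SVM split of the remaining cluster, trained on the oracle-labelled sample, could be sketched as below; scikit-learn's SVC with a linear kernel is an assumption, since the log does not show which SVM library or kernel settings the program uses:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-classified sample, then split the
    unlabelled cluster into predicted-match / predicted-non-match parts."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    match_cluster = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    nonmatch_cluster = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return match_cluster, nonmatch_cluster
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows by one per split.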

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 793 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)852_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                 0.976
recall                 0.408027
f-measure              0.575472
da                          125
dm                            0
ndm                           0
tp                          122
fp                            3
tn                  4.76529e+07
fn                          177
Name: (10, 1 - acm diverg, 852), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)852_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 696
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 696 weight vectors
  Containing 143 true matches and 553 true non-matches
    (20.55% true matches)
  Identified 662 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   633  (95.62%)
          2 :    26  (3.93%)
          3 :     2  (0.30%)
          5 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 662 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 129
     0.000 : 533

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 696
  Number of unique weight vectors: 662

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (662, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 662 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 662 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 26 matches and 58 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.893
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 578 weight vectors
  Based on 26 matches and 58 non-matches
  Classified 87 matches and 491 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (87, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)
    (491, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)

Current size of match and non-match training data sets: 26 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 87 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 87 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 40 matches and 3 non-matches
    Purity of oracle classification:  0.930
    Entropy of oracle classification: 0.365
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

125.0
Analysing file: diverg(15)92_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979592
recall                  0.32107
f-measure              0.483627
da                           98
dm                            0
ndm                           0
tp                           96
fp                            2
tn                  4.76529e+07
fn                          203
Name: (15, 1 - acm diverg, 92), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)92_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 978
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 978 weight vectors
  Containing 169 true matches and 809 true non-matches
    (17.28% true matches)
  Identified 941 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   910  (96.71%)
          2 :    28  (2.98%)
          3 :     2  (0.21%)
          6 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 941 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 152
     0.000 : 789

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 978
  Number of unique weight vectors: 941

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (941, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 941 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 941 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
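
The purity and entropy figures printed throughout this log can be reproduced with a short sketch. The definitions are assumed (purity as the majority-class fraction, entropy as the binary Shannon entropy of the match proportion), but they match the numbers shown above:

```python
from math import log2

def cluster_stats(num_match, num_non_match):
    """Purity (majority-class fraction) and binary Shannon entropy of a
    cluster, given oracle-labelled match / non-match counts.  Assumed
    definitions; they reproduce the figures printed in this log."""
    total = num_match + num_non_match
    p = num_match / total                       # match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# 27 matches / 60 non-matches, as classified by the oracle above
purity, entropy = cluster_stats(27, 60)   # purity ~ 0.690, entropy ~ 0.894
```

A fully pure cluster (e.g. 0 matches and 73 non-matches, as seen later in this log) gives purity 1.0 and entropy 0.0.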

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 854 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 204 matches and 650 non-matches
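
The SVM classification step above (train on the oracle-labelled sample, then split the rest of the cluster into predicted matches and non-matches) could look like the following sketch. It assumes scikit-learn's `SVC`; the kernel and parameters actually used by the program are not shown in this output:

```python
# Assumes scikit-learn is available; the program's actual SVM settings
# are not visible in this log.
from sklearn.svm import SVC

def svm_split(train_match, train_non_match, remaining):
    """Train an SVM on the oracle-labelled sample and split the rest of
    the cluster into predicted matches / non-matches (a sketch of the
    'SVM classification' step in this log)."""
    X = train_match + train_non_match
    y = [1] * len(train_match) + [0] * len(train_non_match)
    clf = SVC(kernel="linear").fit(X, y)
    pred = clf.predict(remaining)
    matches = [v for v, p in zip(remaining, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining, pred) if p == 0]
    return matches, non_matches
```

The two returned lists correspond to the two child clusters that are pushed back onto the queue.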

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (204, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (650, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 650 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 650 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
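
The farthest-first selections listed above can be sketched as a standard greedy farthest-first traversal. This is a minimal version assuming Euclidean distance and a random start vector; the program's exact variant may differ:

```python
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first traversal: start from a random vector, then
    repeatedly add the vector farthest from everything selected so far.
    Minimal sketch; the program's start point and metric are assumptions."""
    rng = random.Random(seed)

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [rng.choice(vectors)]
    # distance of every vector to its nearest selected vector
    d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=d.__getitem__)
        selected.append(vectors[i])
        d = [min(dj, dist(v, vectors[i])) for dj, v in zip(d, vectors)]
    return selected
```

Because each step maximises the distance to the already-selected set, the sample spreads across the weight-vector space, which is why the lists above mix very dissimilar vectors.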

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 0 matches and 73 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

98.0
Analysing the file: diverg(10)920_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 920), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)920_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 736
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 736 weight vectors
  Containing 196 true matches and 540 true non-matches
    (26.63% true matches)
  Identified 694 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   659  (94.96%)
          2 :    32  (4.61%)
          3 :     2  (0.29%)
          7 :     1  (0.14%)
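
The uniqueness and frequency analysis above amounts to two nested counts, sketched here with hypothetical toy vectors (not taken from the log):

```python
from collections import Counter

def frequency_distribution(weight_vectors):
    """How often each unique weight vector occurs, then how many unique
    vectors occur with each frequency (the table printed in this log)."""
    per_vector = Counter(map(tuple, weight_vectors))
    return Counter(per_vector.values())

# hypothetical toy vectors for illustration only
vecs = [[1.0, 0.0], [1.0, 0.0], [0.5, 0.2], [0.1, 0.9]]
dist = frequency_distribution(vecs)       # {2: 1, 1: 2}
```

Summing `occurrence * count` over the distribution recovers the total number of weight vectors, as in the percentages printed above.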

Identified 0 non-pure unique weight vectors (from 694 unique weight vectors)
Pureness (percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 520

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 736
  Number of unique weight vectors: 694

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (694, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 694 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 694 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 25 matches and 59 non-matches
    Purity of oracle classification:  0.702
    Entropy of oracle classification: 0.878
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 610 weight vectors
  Based on 25 matches and 59 non-matches
  Classified 125 matches and 485 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (125, 0.7023809523809523, 0.8783609387702276, 0.2976190476190476)
    (485, 0.7023809523809523, 0.8783609387702276, 0.2976190476190476)

Current size of match and non-match training data sets: 25 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 485 weight vectors
- Estimated match proportion 0.298

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 485 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.348, 0.429, 0.526, 0.529, 0.619] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 12 matches and 57 non-matches
    Purity of oracle classification:  0.826
    Entropy of oracle classification: 0.667
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(20)735_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 735), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)735_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vectors (from 916 unique weight vectors)
Pureness (percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 123 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(15)398_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 398), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)398_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1092
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1092 weight vectors
  Containing 226 true matches and 866 true non-matches
    (20.70% true matches)
  Identified 1035 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   998  (96.43%)
          2 :    34  (3.29%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1035 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 845

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1091
  Number of unique weight vectors: 1035

Time to load and analyse the weight vector file: 0.01 sec
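
The "analyse weight vectors" step above (frequency distribution and pureness of unique weight vectors) can be sketched as follows. This is a hypothetical helper, not the authors' code; `analyse_weight_vectors` and its argument names are assumptions. Pureness of a unique vector is the fraction of its occurrences that are true matches, and vectors whose pureness is strictly between 0 and 1 are the non-pure ones the log removes.

```python
# Hypothetical sketch of the weight-vector analysis step (not the authors' code).
from collections import Counter, defaultdict

def analyse_weight_vectors(vectors, labels):
    """vectors: list of tuples of similarity weights; labels: True/False match status."""
    occ = Counter(vectors)                 # how often each unique vector occurs
    match_count = defaultdict(int)
    for vec, is_match in zip(vectors, labels):
        match_count[vec] += int(is_match)

    # Frequency distribution: occurrence count -> number of unique vectors
    freq_dist = Counter(occ.values())

    # Pureness per unique vector: fraction of its occurrences that are matches
    pureness = {vec: match_count[vec] / n for vec, n in occ.items()}
    return freq_dist, pureness

vecs = [(1.0, 0.9), (1.0, 0.9), (0.2, 0.1)]
labels = [True, False, False]
freq, pure = analyse_weight_vectors(vecs, labels)
# freq == {2: 1, 1: 1}; pure[(1.0, 0.9)] == 0.5, i.e. non-pure and removed
```

A vector with pureness 1.000 or 0.000 is pure (all its duplicates agree on the match status); anything in between is a contradiction and is dropped, which matches the counts in the log above.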

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1035, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1035 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1035 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
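
The farthest-first selection listed above can be sketched as below. The function name and the choice of Euclidean distance are assumptions (the log does not show which distance the program uses); the idea is the standard farthest-first traversal: start from one vector, then repeatedly pick the vector whose distance to its nearest already-selected vector is largest.

```python
# Minimal farthest-first traversal sketch (assumed Euclidean distance;
# hypothetical helper, not the original program's implementation).
import math
import random

def farthest_first(vectors, k, seed=42):
    random.seed(seed)
    selected = [random.choice(vectors)]   # seed the selection with a random vector
    while len(selected) < k:
        # Pick the vector farthest from its nearest selected neighbour.
        best = max(vectors,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
    return selected

pts = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0)]
chosen = farthest_first(pts, 2)
# the second pick is always the point farthest from the first one
```

This greedy rule explains why the selected sample mixes extreme vectors (all-1.0 matches, near-0.0 non-matches) with borderline cases: each pick maximises coverage of the weight-vector space.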

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 27 matches and 61 non-matches
    Purity of oracle classification:  0.693
    Entropy of oracle classification: 0.889
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
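
The purity and entropy figures reported above follow the standard two-class definitions: purity is the majority-class fraction, and entropy is the Shannon entropy (base 2) of the match/non-match proportions. The helper names below are illustrative; the values for 27 matches and 61 non-matches reproduce the log.

```python
# Two-class purity and entropy, as reported in the oracle-classification log.
import math

def purity(n_match, n_nonmatch):
    total = n_match + n_nonmatch
    return max(n_match, n_nonmatch) / total   # majority-class fraction

def entropy(n_match, n_nonmatch):
    total = n_match + n_nonmatch
    h = 0.0
    for n in (n_match, n_nonmatch):
        p = n / total
        if p > 0:                             # 0 * log2(0) is taken as 0
            h -= p * math.log2(p)
    return h

print(round(purity(27, 61), 3))    # 0.693
print(round(entropy(27, 61), 3))   # 0.889
```

A perfectly one-sided cluster (e.g. 47 matches, 0 non-matches, as earlier in the log) gives purity 1.000 and entropy 0.000; an even 50/50 split gives purity 0.5 and entropy 1.0, which is exactly the initial queue entry.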

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 947 weight vectors
  Based on 27 matches and 61 non-matches
  Classified 148 matches and 799 non-matches
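
The SVM split step can be sketched as follows: train a classifier on the oracle-labelled samples, then partition the remaining unlabelled weight vectors of the cluster into a predicted-match and a predicted-non-match sub-cluster (the two queue entries of sizes 148 and 799 above). This uses scikit-learn's `SVC`; the kernel and parameters of the original program are not shown in the log, so the linear kernel here is an assumption.

```python
# Sketch of splitting a cluster with an SVM trained on oracle-labelled samples.
# Kernel choice and function name are assumptions, not the original program's code.
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)         # fit on the oracle-classified sample
    preds = clf.predict(rest_vecs)            # classify the remaining vectors
    match_cluster = [v for v, p in zip(rest_vecs, preds) if p]
    nonmatch_cluster = [v for v, p in zip(rest_vecs, preds) if not p]
    return match_cluster, nonmatch_cluster

train = [[0.9, 0.9], [1.0, 0.8], [0.1, 0.2], [0.2, 0.1]]
labels = [True, True, False, False]
rest = [[0.95, 0.85], [0.15, 0.15]]
m, n = svm_split(train, labels, rest)
# m == [[0.95, 0.85]], n == [[0.15, 0.15]]
```

Both sub-clusters then re-enter the queue with the parent's purity/entropy estimates, which is why the two queue entries in Loop 2 share identical statistics.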

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6931818181818182, 0.8894663896628687, 0.3068181818181818)
    (799, 0.6931818181818182, 0.8894663896628687, 0.3068181818181818)

Current size of match and non-match training data sets: 27 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 799 weight vectors
- Estimated match proportion 0.307

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 799 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 9 matches and 65 non-matches
    Purity of oracle classification:  0.878
    Entropy of oracle classification: 0.534
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)889_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 889), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)889_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 754
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 754 weight vectors
  Containing 222 true matches and 532 true non-matches
    (29.44% true matches)
  Identified 718 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   699  (97.35%)
          2 :    16  (2.23%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 718 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 529

Removed 1 non-pure weight vector

Final number of weight vectors to use: 753
  Number of unique weight vectors: 718

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (718, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 718 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 718 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 634 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 135 matches and 499 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (499, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.92
- Size 499 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 499 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 13 matches and 60 non-matches
    Purity of oracle classification:  0.822
    Entropy of oracle classification: 0.676
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(20)34_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 34), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)34_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

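The farthest-first selection printed above is a greedy traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the whole weight-vector space. A minimal sketch (Euclidean distance, arbitrary starting vector; not the script's actual implementation):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly pick the vector
    whose minimum Euclidean distance to the selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because each pick maximises the distance to everything chosen so far, the sample tends to include the extreme corners of the cluster, which is why the listed vectors mix clear matches and clear non-matches.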
Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

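The purity and entropy figures the log reports for an oracle-classified sample follow directly from the match/non-match split: purity is the majority-class fraction (63/88 ≈ 0.716 here) and entropy is the binary Shannon entropy of the split in bits. A small self-contained sketch (the function name `purity_entropy` is ours, not the script's):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = binary
    Shannon entropy (in bits) of the match/non-match proportions."""
    n = num_matches + num_non_matches
    p = num_matches / n
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:           # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the sample above, `purity_entropy(25, 63)` gives approximately (0.716, 0.861), matching the logged values; a perfectly pure sample gives (1.0, 0.0).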
Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 817 non-matches

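The split step above trains a classifier on the oracle-labelled sample and partitions the cluster's remaining weight vectors by its predictions. A sketch of that idea using scikit-learn's `SVC` (assuming scikit-learn is available; `svm_split` is a hypothetical helper, and the script's actual kernel and parameters are not shown in this output):

```python
# Hypothetical sketch of the SVM-based cluster split, assuming
# scikit-learn; a linear kernel is an assumption, not the script's
# documented setting.
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining cluster into predicted-match and predicted-non-match
    sub-clusters, which are pushed back onto the queue."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if not p]
    return matches, non_matches
```

This is how one 948-vector cluster becomes the two queue entries (131 and 817 vectors) seen in the next loop.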
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 817 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 817 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 11 matches and 60 non-matches
    Purity of oracle classification:  0.845
    Entropy of oracle classification: 0.622
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

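The oracle accuracy seen throughout this log (100.00% in these runs, set by the `oracle_acc` parameter) can be simulated by flipping each true label with probability 1 − accuracy. A hypothetical helper sketching that idea (`noisy_oracle` is not a function from the script):

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    """Simulate a human oracle: each label is returned correctly with
    probability `accuracy`, otherwise flipped."""
    rng = rng or random.Random(42)  # fixed seed for reproducibility
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

With `accuracy=1.0` every label is returned unchanged, which is why the runs above report zero false matches and zero false non-matches.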
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)888_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 888), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)888_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 103 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 103 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 43 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(15)289_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (15, 1 - acm diverg, 289), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)289_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 975
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 975 weight vectors
  Containing 166 true matches and 809 true non-matches
    (17.03% true matches)
  Identified 938 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   907  (96.70%)
          2 :    28  (2.99%)
          3 :     2  (0.21%)
          6 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 938 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 149
     0.000 : 789

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 975
  Number of unique weight vectors: 938

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (938, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 938 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 938 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 851 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 114 matches and 737 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (114, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (737, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 114 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 114 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
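For reference, the "farthest first selection" steps logged above can be sketched as the standard greedy k-center traversal: start from one vector, then repeatedly pick the remaining vector whose minimum distance to the already-selected set is largest. This is a minimal illustration under that assumption, not the program's exact implementation:

```python
import random

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal (k-center heuristic) over
    weight vectors, using squared Euclidean distance."""
    rng = random.Random(seed)
    remaining = list(vectors)
    # Start from a randomly chosen vector.
    selected = [remaining.pop(rng.randrange(len(remaining)))]
    while remaining and len(selected) < k:
        # Distance from a candidate to its nearest already-selected vector.
        def min_dist(v):
            return min(sum((a - b) ** 2 for a, b in zip(v, s))
                       for s in selected)
        far = max(remaining, key=min_dist)
        remaining.remove(far)
        selected.append(far)
    return selected
```

Selecting 48 out of 114 vectors, as in the log above, would then be `farthest_first(cluster_vectors, 48)`.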

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 44 matches and 4 non-matches
    Purity of oracle classification:  0.917
    Entropy of oracle classification: 0.414
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing the file: diverg(10)226_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 226), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)226_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 568
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 568 weight vectors
  Containing 201 true matches and 367 true non-matches
    (35.39% true matches)
  Identified 535 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   521  (97.38%)
          2 :    11  (2.06%)
          3 :     2  (0.37%)
         19 :     1  (0.19%)

Identified 1 non-pure unique weight vector (from 535 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 168
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 366

Removed 1 non-pure weight vector

Final number of weight vectors to use: 567
  Number of unique weight vectors: 535

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (535, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 535 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 535 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 25 matches and 56 non-matches
    Purity of oracle classification:  0.691
    Entropy of oracle classification: 0.892
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
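The purity and entropy figures the oracle steps report follow the usual definitions: purity is the fraction of the majority class, and entropy is the binary Shannon entropy of the match/non-match split. A small sketch that reproduces the logged values:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary Shannon entropy
    of a match / non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 25 matches / 56 non-matches classified above this gives purity 0.691 and entropy 0.892, matching the log.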

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 454 weight vectors
  Based on 25 matches and 56 non-matches
  Classified 140 matches and 314 non-matches
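The log shows an SVM trained on the oracle-labelled vectors being used to split the remaining cluster into a predicted-match and a predicted-non-match part. As a dependency-free stand-in (nearest-centroid rather than the SVM the program actually uses), the split step looks like:

```python
def split_cluster(cluster, match_train, non_match_train):
    """Split unlabelled vectors by whichever training centroid is nearer.
    NOTE: a nearest-centroid illustration, not the program's SVM."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    cm = centroid(match_train)
    cn = centroid(non_match_train)
    matches = [v for v in cluster if d2(v, cm) <= d2(v, cn)]
    non_matches = [v for v in cluster if d2(v, cm) > d2(v, cn)]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, as seen in the next loop of the log.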

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.691358024691358, 0.8915996278279094, 0.30864197530864196)
    (314, 0.691358024691358, 0.8915996278279094, 0.30864197530864196)

Current size of match and non-match training data sets: 25 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 140 weight vectors
- Estimated match proportion 0.309

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 140 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 50 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.235
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(15)386_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 386), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)386_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 24 matches and 64 non-matches
    Purity of oracle classification:  0.727
    Entropy of oracle classification: 0.845
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 24 matches and 64 non-matches
  Classified 91 matches and 857 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (91, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)
    (857, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)

Current size of match and non-match training data sets: 24 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.73 and entropy 0.85
- Size 857 weight vectors
- Estimated match proportion 0.273

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 857 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 18 matches and 52 non-matches
    Purity of oracle classification:  0.743
    Entropy of oracle classification: 0.822
    Number of true matches:      18
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)107_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 107), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)107_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 799
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 799 weight vectors
  Containing 222 true matches and 577 true non-matches
    (27.78% true matches)
  Identified 745 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   708  (95.03%)
          2 :    34  (4.56%)
          3 :     2  (0.27%)
         17 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 745 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 556

Removed 1 non-pure weight vector

Final number of weight vectors to use: 798
  Number of unique weight vectors: 745

Time to load and analyse the weight vector file: 0.01 sec
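
The non-pure-vector cleanup above can be sketched as follows: identical weight vectors are grouped, each group's pureness (fraction of true matches) is computed, and the minority-class vectors are dropped from any group that is not fully pure. This is an illustrative reconstruction; the function and variable names are not from the original program:

```python
def remove_non_pure_minority(weight_vectors, labels):
    """Group identical weight vectors; for each non-pure group (one
    containing both matches and non-matches), drop the minority-class
    vectors. Returns the surviving vectors and their labels."""
    groups = {}
    for wv, is_match in zip(weight_vectors, labels):
        groups.setdefault(tuple(wv), []).append(is_match)

    kept_vectors, kept_labels = [], []
    for wv, group_labels in groups.items():
        pureness = sum(group_labels) / len(group_labels)
        if 0.0 < pureness < 1.0:
            # non-pure group: keep only the majority class
            majority = pureness >= 0.5
            group_labels = [l for l in group_labels if l == majority]
        kept_vectors.extend([list(wv)] * len(group_labels))
        kept_labels.extend(group_labels)
    return kept_vectors, kept_labels
```

On this file, one group with pureness 0.941 (16 matches, 1 non-match) loses its single minority-class vector, leaving 798 of the original 799.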

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (745, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 745 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 745 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
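
The "far" initial-selection method above is a standard greedy farthest-first traversal: each step adds the vector whose distance to its nearest already-selected vector is largest. A sketch assuming Euclidean distance and a fixed seed vector (the original program's seeding and metric may differ):

```python
import math

def farthest_first(vectors, k, seed_index=0):
    """Greedily select k vectors by farthest-first traversal."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [seed_index]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, vectors[seed_index]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            d = dist(v, vectors[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return [vectors[i] for i in selected]
```

The traversal spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.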

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 660 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 147 matches and 513 non-matches
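
The SVM split above trains on the oracle-labelled vectors and partitions the remainder of the cluster by predicted class. A sketch assuming scikit-learn's `svm.SVC` with default settings — the original program's kernel and parameters are not shown in this log:

```python
from sklearn import svm  # assumes scikit-learn is installed

def svm_split_cluster(labelled_vectors, labels, remaining_vectors):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining cluster into predicted matches and non-matches."""
    clf = svm.SVC()  # default RBF kernel; original settings unknown
    clf.fit(labelled_vectors, labels)
    predictions = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, predictions) if p]
    non_matches = [v for v, p in zip(remaining_vectors, predictions) if not p]
    return matches, non_matches
```

The two resulting sub-clusters (147 and 513 vectors here) are pushed back onto the queue, each inheriting the parent cluster's purity, entropy, and estimated match proportion until they are sampled themselves.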

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (513, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 513 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 513 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 9 matches and 63 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(15)227_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 227), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)227_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 541
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 541 weight vectors
  Containing 220 true matches and 321 true non-matches
    (40.67% true matches)
  Identified 503 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   485  (96.42%)
          2 :    15  (2.98%)
          3 :     2  (0.40%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 503 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 318

Removed 1 non-pure weight vector

Final number of weight vectors to use: 540
  Number of unique weight vectors: 503

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (503, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 503 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 503 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 32 matches and 48 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 423 weight vectors
  Based on 32 matches and 48 non-matches
  Classified 142 matches and 281 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6, 0.9709505944546686, 0.4)
    (281, 0.6, 0.9709505944546686, 0.4)

Current size of match and non-match training data sets: 32 / 48

Selected cluster (queue ordering: random) with:
- Purity 0.60 and entropy 0.97
- Size 281 weight vectors
- Estimated match proportion 0.400

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 281 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.800, 0.636, 0.563, 0.545, 0.722] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 8 matches and 61 non-matches
    Purity of oracle classification:  0.884
    Entropy of oracle classification: 0.518
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)626_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 626), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)626_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 908
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 908 weight vectors
  Containing 213 true matches and 695 true non-matches
    (23.46% true matches)
  Identified 853 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   817  (95.78%)
          2 :    33  (3.87%)
          3 :     2  (0.23%)
         19 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 853 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 674

Removed 1 non-pure weight vector

Final number of weight vectors to use: 907
  Number of unique weight vectors: 853

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (853, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 853 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 853 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
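
The farthest-first selections printed throughout this log greedily pick, at each step, the weight vector whose distance to its nearest already-selected vector is largest. A minimal sketch of the idea (the function name and the Euclidean distance are assumptions; the script's actual implementation may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from an arbitrary vector,
    then repeatedly add the vector that is farthest from the set
    selected so far (assumed helper, Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    # distance from every candidate to its closest selected vector
    d_min = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: d_min[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            d_min[j] = min(d_min[j], dist(v, vectors[i]))
    return selected
```

This spreads the oracle queries across the weight-vector space rather than sampling uniformly, which is why the selected vectors above mix clear matches and clear non-matches.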

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 30 matches and 56 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.933
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
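
The purity and entropy figures reported after each oracle call follow the usual binary definitions: with m matches and u non-matches, purity is max(m, u)/(m + u), and entropy is the binary entropy of the match proportion p = m/(m + u). A quick self-contained check against the counts above (30 matches, 56 non-matches; helper name assumed):

```python
import math

def purity_entropy(num_match, num_non_match):
    total = num_match + num_non_match
    p = num_match / total                          # match proportion
    purity = max(num_match, num_non_match) / total
    # binary entropy; defined as 0 for a perfectly pure cluster
    if p in (0.0, 1.0):
        entropy = 0.0
    else:
        entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return purity, entropy

purity, entropy = purity_entropy(30, 56)
print(round(purity, 3), round(entropy, 3))   # 0.651 0.933
```

The match proportion 30/86 = 0.349 is also the "estimated match proportion" the log reports for the child clusters in the next loop.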

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 767 weight vectors
  Based on 30 matches and 56 non-matches
  Classified 199 matches and 568 non-matches
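
After each oracle round, the freshly labelled sample trains an SVM that splits the remaining unlabelled cluster into a predicted-match and a predicted-non-match sub-cluster, both pushed back onto the queue. A sketch of that split using scikit-learn (an assumption — the original code may use a different SVM implementation or kernel):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining unlabelled cluster by predicted class (sketch)."""
    clf = SVC(kernel="linear")
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches
```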

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (199, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)
    (568, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)

Current size of match and non-match training data sets: 30 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 199 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 61

Farthest first selection of 61 weight vectors from 199 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.619, 1.000, 0.103, 0.163, 0.129, 0.146, 0.213] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 61 weight vectors
  The oracle will correctly classify 61 weight vectors and wrongly classify 0
  Classified 40 matches and 21 non-matches
    Purity of oracle classification:  0.656
    Entropy of oracle classification: 0.929
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  21
    Number of false non-matches: 0

Deleted 61 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(15)290_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 290), dtype: object
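
The precision, recall and f-measure rows in this Series follow directly from its tp/fp/fn counts (tp = 52, fp = 1, fn = 247):

```python
tp, fp, fn = 52, 1, 247

precision = tp / (tp + fp)                       # 52 / 53
recall = tp / (tp + fn)                          # 52 / 299
f_measure = 2 * precision * recall / (precision + recall)

print(round(precision, 6), round(recall, 6), round(f_measure, 6))
# 0.981132 0.173913 0.295455
```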

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)290_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1015
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1015 weight vectors
  Containing 213 true matches and 802 true non-matches
    (20.99% true matches)
  Identified 963 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   928  (96.37%)
          2 :    32  (3.32%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)
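
The occurrence distribution above can be reproduced with two nested counts: one over the vectors themselves, then one over those counts. A small self-contained sketch (helper name assumed):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then how
    many distinct vectors share each occurrence count."""
    vec_counts = Counter(map(tuple, weight_vectors))   # vector -> occurrences
    freq_dist = Counter(vec_counts.values())           # occurrences -> #vectors
    return vec_counts, freq_dist

vecs = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3], [0.9, 0.9]]
_, dist = occurrence_distribution(vecs)
print(sorted(dist.items()))   # [(1, 2), (2, 1)]
```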

Identified 1 non-pure unique weight vector (from 963 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.941 :  1   (minority-class weight vectors with this pureness are removed)
     0.000 : 781

Removed 1 non-pure weight vector
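
Pureness of a unique weight vector is the fraction of its occurrences generated by true matches; any vector with pureness strictly between 0 and 1 is non-pure, and its minority-class copies are dropped. A sketch (the 16-to-1 split yielding pureness 0.941 is an illustrative assumption, consistent with the one vector above occurring 17 times):

```python
from collections import defaultdict

def pureness(pairs):
    """pairs: iterable of (weight_vector_tuple, is_match) occurrences.
    Returns the fraction of match occurrences per unique vector."""
    match = defaultdict(int)
    total = defaultdict(int)
    for vec, is_match in pairs:
        total[vec] += 1
        match[vec] += int(is_match)
    return {vec: match[vec] / total[vec] for vec in total}

pairs = [((1.0, 0.9), True)] * 16 + [((1.0, 0.9), False), ((0.1, 0.2), False)]
p = pureness(pairs)
print(round(p[(1.0, 0.9)], 3))   # 0.941
```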

Final number of weight vectors to use: 1014
  Number of unique weight vectors: 963

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (963, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 963 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 963 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 876 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 122 matches and 754 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (122, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (754, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 754 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 754 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 11 matches and 62 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)903_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 903), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)903_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 634
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 634 weight vectors
  Containing 189 true matches and 445 true non-matches
    (29.81% true matches)
  Identified 613 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   599  (97.72%)
          2 :    11  (1.79%)
          3 :     2  (0.33%)
          7 :     1  (0.16%)

Identified 0 non-pure unique weight vectors (from 613 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 168
     0.000 : 445

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 634
  Number of unique weight vectors: 613

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (613, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 613 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 613 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.364, 0.619, 0.471, 0.600, 0.533] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 29 matches and 54 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.934
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 530 weight vectors
  Based on 29 matches and 54 non-matches
  Classified 129 matches and 401 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (129, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)
    (401, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)

Current size of match and non-match training data sets: 29 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 129 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 129 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
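
Farthest-first selection, as used above, greedily picks the vector whose minimum distance to the already-selected set is largest. A small sketch assuming Euclidean distance and a deterministic starting vector (the program's actual start choice is not shown in the log):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: after a starting vector, repeatedly
    add the vector whose minimum distance to the selected set is largest."""
    selected = [vectors[0]]  # deterministic start; an assumption for this sketch
    while len(selected) < k and len(selected) < len(vectors):
        remaining = [v for v in vectors if v not in selected]
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
    return selected

corners = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(corners, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```

The selected points spread across the space, which is why the sample above mixes high- and low-similarity weight vectors.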

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 50 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.235
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(15)69_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985915
recall                 0.234114
f-measure              0.378378
da                           71
dm                            0
ndm                           0
tp                           70
fp                            1
tn                  4.76529e+07
fn                          229
Name: (15, 1 - acm diverg, 69), dtype: object
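
The precision, recall and f-measure fields in this row are determined by its tp/fp/fn counts (fn = 229, so recall = 70/299):

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from the confusion counts
    reported in the result row above."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

p, r, f = prf(70, 1, 229)
print(round(p, 6), round(r, 6), round(f, 6))  # 0.985915 0.234114 0.378378
```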

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)69_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 872
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 872 weight vectors
  Containing 186 true matches and 686 true non-matches
    (21.33% true matches)
  Identified 832 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   798  (95.91%)
          2 :    31  (3.73%)
          3 :     2  (0.24%)
          6 :     1  (0.12%)
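
The occurrence table above is a frequency-of-frequencies count over the weight vectors, with percentages taken over the unique vectors; a Counter-based sketch on illustrative data:

```python
from collections import Counter

# Illustrative vectors: one occurs twice, one once, one three times.
vectors = [(1.0, 0.0), (1.0, 0.0), (0.5, 0.5),
           (0.2, 0.8), (0.2, 0.8), (0.2, 0.8)]
occurrences = Counter(vectors)             # how often each unique vector occurs
freq_dist = Counter(occurrences.values())  # how many vectors occur that often
unique = len(occurrences)
for occ, num in sorted(freq_dist.items()):
    print(f"{occ:>4} : {num:>5}  ({100.0 * num / unique:.2f}%)")
```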

Identified 0 non-pure unique weight vectors (from 832 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 166
     0.000 : 666

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 872
  Number of unique weight vectors: 832

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (832, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 832 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 832 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 746 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 148 matches and 598 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (598, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 148 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 45 matches and 9 non-matches
    Purity of oracle classification:  0.833
    Entropy of oracle classification: 0.650
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

71.0
Analysing file: diverg(10)359_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.980198
recall                 0.331104
f-measure                 0.495
da                          101
dm                            0
ndm                           0
tp                           99
fp                            2
tn                  4.76529e+07
fn                          200
Name: (10, 1 - acm diverg, 359), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)359_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 463
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 463 weight vectors
  Containing 149 true matches and 314 true non-matches
    (32.18% true matches)
  Identified 451 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   443  (98.23%)
          2 :     5  (1.11%)
          3 :     2  (0.44%)
          4 :     1  (0.22%)

Identified 0 non-pure unique weight vectors (from 451 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 137
     0.000 : 314

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 463
  Number of unique weight vectors: 451

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (451, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 451 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 451 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 28 matches and 51 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.938
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 372 weight vectors
  Based on 28 matches and 51 non-matches
  Classified 112 matches and 260 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.6455696202531646, 0.9379626436434423, 0.35443037974683544)
    (260, 0.6455696202531646, 0.9379626436434423, 0.35443037974683544)

Current size of match and non-match training data sets: 28 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 112 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 43 matches and 6 non-matches
    Purity of oracle classification:  0.878
    Entropy of oracle classification: 0.536
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

101.0
Analysing file: diverg(20)907_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 907), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)907_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019
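
The one non-pure vector above occurs 20 times with pureness 0.950 (19 matches, 1 non-match), so its single minority-class occurrence is removed, taking the set from 1076 to 1075 vectors. A sketch of that clean-up (function name and the tie-break rule are assumptions):

```python
from collections import defaultdict

def remove_minority_occurrences(pairs):
    """Group occurrences by unique weight vector; where a vector appears
    with both labels, keep only the majority-class occurrences."""
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)  # fraction of occurrences that are matches
        majority = pureness >= 0.5            # tie-break toward matches (an assumption)
        kept.extend((vec, lbl) for lbl in labels if lbl == majority)
    return kept

# 19 match and 1 non-match copies of one vector, plus a pure non-match:
pairs = [((0.9, 0.9), True)] * 19 + [((0.9, 0.9), False)] + [((0.1, 0.2), False)]
print(len(remove_minority_occurrences(pairs)))  # 20
```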

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

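The farthest-first traversal used for these selections can be sketched as below — a minimal illustration, not the program's implementation; Euclidean distance and seeding from the first vector are assumptions:

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors: start from the first vector, then
    repeatedly add the vector whose minimum Euclidean distance to the
    already-selected set is largest, so the sample spreads out."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    while len(selected) < k:
        candidate = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(candidate)
    return selected
```

The oracle is then asked to label only this spread-out sample rather than the whole cluster.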
Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

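The purity and entropy values reported for each oracle-labelled sample are consistent with the standard two-class definitions (majority-class fraction, and Shannon entropy in bits); a sketch:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = two-class
    Shannon entropy in bits (1.0 for a 50/50 split, 0.0 if pure)."""
    n = num_matches + num_non_matches
    p = num_matches / n
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log(q, 2) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# Reproduces the figures above for 23 matches and 64 non-matches:
purity, entropy = purity_entropy(23, 64)
print(round(purity, 3), round(entropy, 3))  # 0.736 0.833
```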
Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

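The SVM split step trains on the oracle-labelled sample and partitions the remaining weight vectors by predicted class. A minimal sketch, assuming scikit-learn's `SVC` — the notebook's actual SVM implementation and kernel settings are not shown in this output:

```python
from sklearn import svm

def svm_split(train_vectors, train_labels, cluster_vectors):
    """Train an SVM on the oracle-labelled vectors (1 = match,
    0 = non-match) and split the remaining cluster into a
    predicted-match and a predicted-non-match sub-cluster."""
    classifier = svm.SVC()  # default kernel/parameters are an assumption
    classifier.fit(train_vectors, train_labels)
    predictions = classifier.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, predictions) if p == 0]
    return matches, non_matches
```

Both sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.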
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)220_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 220), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)220_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 844
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 844 weight vectors
  Containing 209 true matches and 635 true non-matches
    (24.76% true matches)
  Identified 797 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   762  (95.61%)
          2 :    32  (4.02%)
          3 :     2  (0.25%)
         12 :     1  (0.13%)

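A frequency distribution like the one above can be reproduced with `collections.Counter`: count how often each unique weight vector occurs, then count how many unique vectors share each occurrence count. A sketch (vectors converted to tuples so they are hashable):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Return {occurrence count: number of unique vectors with it}."""
    per_vector = Counter(map(tuple, weight_vectors))   # vector -> count
    return Counter(per_vector.values())                # count -> #vectors

dist = occurrence_distribution([[0.5, 1.0], [0.5, 1.0], [0.2, 0.9]])
print(sorted(dist.items()))  # [(1, 1), (2, 1)]
```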
Identified 1 non-pure unique weight vector (from 797 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 843
  Number of unique weight vectors: 797

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (797, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 797 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 797 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 26 matches and 59 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.888
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 712 weight vectors
  Based on 26 matches and 59 non-matches
  Classified 123 matches and 589 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)
    (589, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)

Current size of match and non-match training data sets: 26 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 589 weight vectors
- Estimated match proportion 0.306

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 589 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 16 matches and 56 non-matches
    Purity of oracle classification:  0.778
    Entropy of oracle classification: 0.764
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(20)244_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 244), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)244_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 0 matches and 829 non-matches

40.0
Analysing the file: diverg(15)235_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (15, 1 - acm diverg, 235), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)235_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 724
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 724 weight vectors
  Containing 212 true matches and 512 true non-matches
    (29.28% true matches)
  Identified 671 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   636  (94.78%)
          2 :    32  (4.77%)
          3 :     2  (0.30%)
         18 :     1  (0.15%)
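
A frequency distribution like the one above (how many unique weight vectors occur exactly k times) can be computed with two nested `Counter`s; the `occurrence_distribution` name is illustrative, not the script's own.

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """How many unique weight vectors occur exactly k times, per k."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
        (0.9, 0.9), (0.9, 0.9), (0.9, 0.9)]
print(sorted(occurrence_distribution(vecs).items()))
# [(1, 1), (2, 1), (3, 1)] -> one singleton, one pair, one triple
```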

Identified 1 non-pure unique weight vector (from 671 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 491

Removed 1 non-pure weight vector

Final number of weight vectors to use: 723
  Number of unique weight vectors: 671
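
A unique weight vector is non-pure when identical copies of it carry both match and non-match labels; the log shows the minority-class copies being removed. A sketch of that filtering step (the `remove_minority_class` helper and its tie-breaking are assumptions):

```python
from collections import defaultdict

def remove_minority_class(records):
    """records: (weight_vector_tuple, is_match) pairs.  For a unique
    weight vector carrying mixed labels, keep only the majority-class
    copies; pure vectors pass through unchanged."""
    labels_by_vec = defaultdict(list)
    for vec, is_match in records:
        labels_by_vec[vec].append(is_match)
    kept = []
    for vec, is_match in records:
        labels = labels_by_vec[vec]
        majority_is_match = sum(labels) * 2 >= len(labels)  # ties -> match
        if is_match == majority_is_match:
            kept.append((vec, is_match))
    return kept

# One vector seen 18 times: 17 matches + 1 non-match (pureness 0.944),
# plus one pure non-match vector -> only the single minority copy is dropped.
recs = [((1.0, 1.0), True)] * 17 + [((1.0, 1.0), False), ((0.1, 0.2), False)]
print(len(remove_minority_class(recs)))  # 18
```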

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (671, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 671 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 671 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
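
The "far" selection above is a farthest-first traversal: starting from some vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster. A standard sketch (the starting vector and the squared-Euclidean metric are assumptions; the script's choices are not shown in this log):

```python
def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum squared Euclidean distance to the already-selected set is
    largest.  Returns the indices of the k selected vectors."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    selected = [start]
    min_d = [dist2(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(nxt)
        min_d = [min(d, dist2(v, vectors[nxt]))
                 for d, v in zip(min_d, vectors)]
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
print(farthest_first(pts, 3))  # [0, 2, 3]
```

Because each pick maximizes distance to the current sample, near-duplicate vectors (like the first two points) are selected last.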

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 587 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 142 matches and 445 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (445, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 142 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analyzing file: diverg(15)509_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 509), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)509_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1023
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1023 weight vectors
  Containing 222 true matches and 801 true non-matches
    (21.70% true matches)
  Identified 969 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   932  (96.18%)
          2 :    34  (3.51%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 969 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 780

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1022
  Number of unique weight vectors: 969

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (969, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 969 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 969 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 882 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 302 matches and 580 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (302, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (580, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 302 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 302 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.600, 1.000, 0.217, 0.132, 0.167, 0.125, 0.188] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 43 matches and 25 non-matches
    Purity of oracle classification:  0.632
    Entropy of oracle classification: 0.949
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  25
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(20)267_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 267), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)267_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1075
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1075 weight vectors
  Containing 208 true matches and 867 true non-matches
    (19.35% true matches)
  Identified 1028 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   993  (96.60%)
          2 :    32  (3.11%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1028 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1074
  Number of unique weight vectors: 1028

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1028, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1028 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1028 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
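
The oracle step above can be sketched as a small simulation; the function name `oracle_classify` and the fixed seed are illustrative choices, not taken from the original program:

```python
import random

def oracle_classify(true_labels, accuracy=1.0, rng=None):
    """Simulate the manual oracle: each true match status is returned
    correctly with probability `accuracy` and flipped otherwise.
    Returns the oracle's labels plus the (true match, false match,
    true non-match, false non-match) counts the log reports."""
    rng = rng or random.Random(42)  # illustrative seed
    preds = [lab if rng.random() < accuracy else not lab
             for lab in true_labels]
    tm = sum(p and t for p, t in zip(preds, true_labels))
    fm = sum(p and not t for p, t in zip(preds, true_labels))
    tn = sum(not p and not t for p, t in zip(preds, true_labels))
    fn = sum(not p and t for p, t in zip(preds, true_labels))
    return preds, (tm, fm, tn, fn)
```

With `accuracy=1.0`, as in this run, every label is returned unchanged, which is why the log shows 0 false matches and 0 false non-matches.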

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 940 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 123 matches and 817 non-matches
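
After the oracle labels a sample, the remaining weight vectors of the cluster are split by a classifier trained on those labels. The program trains an SVM here; the sketch below substitutes a dependency-free nearest-centroid classifier to illustrate only the splitting logic, not the program's actual implementation:

```python
import math

def centroid(vecs):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vecs)
    return [sum(col) / n for col in zip(*vecs)]

def split_cluster(unlabelled, match_examples, nonmatch_examples):
    """Split the remaining (unlabelled) weight vectors of a cluster into
    predicted-match and predicted-non-match sub-clusters, trained on the
    oracle-labelled examples. Nearest-centroid stands in for the SVM."""
    m_c = centroid(match_examples)
    n_c = centroid(nonmatch_examples)
    def d(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    matches, non_matches = [], []
    for v in unlabelled:
        (matches if d(v, m_c) <= d(v, n_c) else non_matches).append(v)
    return matches, non_matches
```

The two sub-clusters produced here are what re-enter the queue in the next loop (sizes 123 and 817 in this run).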

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
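
The purity and entropy printed for each queued cluster are consistent with the majority-class proportion and the base-2 Shannon entropy of the oracle's match/non-match split; a sketch (the function name is ours):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity = proportion of the majority class; entropy = base-2
    Shannon entropy of the match/non-match proportions, matching the
    figures printed in the log."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the 25 matches and 63 non-matches classified above, this yields the logged purity 0.7159... and entropy 0.8609...; a perfectly mixed cluster gives (0.5, 1.0), as in the initial queue entry.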

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 123 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
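
The farthest-first selections above can be sketched with the standard greedy traversal: repeatedly add the vector whose minimum distance to the already-selected set is largest, which spreads the sample across the cluster. How the original program seeds the first vector is an assumption here:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors. Starts from the
    first vector (seeding is an assumption), then repeatedly adds the
    vector maximising its minimum Euclidean distance to the selected set."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

This greedy spread is why the selected samples mix clear matches and clear non-matches rather than clustering around one region of weight space.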

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)752_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990099
recall                 0.334448
f-measure                   0.5
da                          101
dm                            0
ndm                           0
tp                          100
fp                            1
tn                  4.76529e+07
fn                          199
Name: (10, 1 - acm diverg, 752), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)752_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 779
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 779 weight vectors
  Containing 165 true matches and 614 true non-matches
    (21.18% true matches)
  Identified 740 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   711  (96.08%)
          2 :    26  (3.51%)
          3 :     2  (0.27%)
         10 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 740 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 146
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 593

Removed 1 non-pure weight vector

Final number of weight vectors to use: 778
  Number of unique weight vectors: 740

Time to load and analyse the weight vector file: 0.01 sec
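
The "non-pure" removal reported above groups identical weight vectors, computes each unique vector's pureness (the fraction of its copies generated by true matches), and drops the copies whose match status disagrees with the vector's majority class. This behaviour is reconstructed from the log, so treat the sketch as an assumption about the program:

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    """Keep all copies of pure weight vectors (pureness 0.0 or 1.0);
    for non-pure vectors, keep only the majority-class copies.
    Reconstructed from the log's 'Removed N non-pure weight vectors'
    step, not taken from the original source."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)
    kept = []
    for vec, lab in zip(weight_vectors, labels):
        labs = groups[tuple(vec)]
        pureness = sum(labs) / len(labs)  # fraction of true-match copies
        majority_is_match = pureness >= 0.5
        if pureness in (0.0, 1.0) or lab == majority_is_match:
            kept.append((vec, lab))
    return kept
```

In this run the single unique vector with pureness 0.900 occurred 10 times (9 matches, 1 non-match), so exactly one minority copy is removed, leaving 778 of 779 vectors.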

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (740, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 740 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 740 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 655 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 116 matches and 539 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (116, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (539, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 116 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 116 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 42 matches and 8 non-matches
    Purity of oracle classification:  0.840
    Entropy of oracle classification: 0.634
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

101.0
Analysing the file: diverg(20)820_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 820), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)820_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
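
The purity and entropy figures printed by the oracle step can be reproduced from the match / non-match counts alone. A minimal sketch, assuming purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the two-class distribution (which matches the numbers above; this is an illustrative reconstruction, not the script's actual code):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class in the cluster;
    entropy = base-2 Shannon entropy of the two-class distribution."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log(0) is defined as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The oracle result above: 14 matches, 54 non-matches
purity, entropy = purity_entropy(14, 54)
print('Purity:  %.3f' % purity)   # Purity:  0.794
print('Entropy: %.3f' % entropy)  # Entropy: 0.734
```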

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)203_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 203), dtype: object
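
The precision, recall and f-measure values in these summary rows follow directly from the tp / fp / fn counts. A small sketch of the standard formulas (the row above has tp=39, fp=0, fn=260):

```python
def precision_recall_f1(tp, fp, fn):
    """Standard binary classification quality measures."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# tp=39, fp=0, fn=260 as in the row above
p, r, f = precision_recall_f1(39, 0, 260)
print('%.6f %.6f %.6f' % (p, r, f))  # 1.000000 0.130435 0.230769
```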

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)203_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1068
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1068 weight vectors
  Containing 226 true matches and 842 true non-matches
    (21.16% true matches)
  Identified 1011 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   974  (96.34%)
          2 :    34  (3.36%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1011 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 821

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1067
  Number of unique weight vectors: 1011
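
The duplicate analysis above (frequency distribution of occurrences and per-vector pureness) can be sketched with `collections.Counter`. This is an illustrative reconstruction under the assumption that "pureness" is the fraction of a unique vector's occurrences that are true matches:

```python
from collections import Counter

def analyse_weight_vectors(vectors, labels):
    """Count duplicate weight vectors and the match 'pureness' of each
    unique vector (fraction of its occurrences that are true matches)."""
    occ = Counter(map(tuple, vectors))
    match_occ = Counter(tuple(v) for v, m in zip(vectors, labels) if m)
    freq_dist = Counter(occ.values())  # occurrence count -> how many vectors
    pureness = {v: match_occ[v] / n for v, n in occ.items()}
    return freq_dist, pureness

# Toy data: one duplicated vector with mixed labels (non-pure)
vecs = [[0.5, 1.0], [0.5, 1.0], [0.9, 0.8]]
labs = [True, False, True]
freq, pure = analyse_weight_vectors(vecs, labs)
# freq == Counter({1: 1, 2: 1}); pure[(0.5, 1.0)] == 0.5 (non-pure)
```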

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1011, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1011 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1011 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
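
"Farthest first" here is the classic greedy traversal: seed with one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the weight-vector space. A minimal sketch (Euclidean distance and a fixed starting vector are assumptions; the script may differ in both):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of distinct tuples."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]  # arbitrary seed
    while len(selected) < k:
        # Candidate maximising its minimum distance to the selected set
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

sample = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(sample, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```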

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 924 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 131 matches and 793 non-matches
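
The split step trains a classifier on the oracle-labelled sample and uses its predictions to divide the remaining cluster members into a match and a non-match cluster. A sketch using scikit-learn's `SVC` (an assumption: the original program may use a different SVM implementation, kernel, or parameters):

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on oracle-labelled vectors (1 = match, 0 = non-match),
    then split the remaining vectors by the predicted class."""
    clf = svm.SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(remaining_vecs)
    match_cluster = [v for v, p in zip(remaining_vecs, pred) if p == 1]
    non_match_cluster = [v for v, p in zip(remaining_vecs, pred) if p == 0]
    return match_cluster, non_match_cluster

# Toy example: high similarity values labelled 1 (match), low ones 0
train = [[0.9, 0.8], [0.95, 0.9], [0.1, 0.2], [0.2, 0.1]]
labels = [1, 1, 0, 0]
matches, non_matches = svm_split(train, labels, [[0.85, 0.9], [0.15, 0.1]])
```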

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (793, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 793 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 12 matches and 58 non-matches
    Purity of oracle classification:  0.829
    Entropy of oracle classification: 0.661
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)641_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.977273
recall                 0.431438
f-measure              0.598608
da                          132
dm                            0
ndm                           0
tp                          129
fp                            3
tn                  4.76529e+07
fn                          170
Name: (10, 1 - acm diverg, 641), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)641_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 771
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 771 weight vectors
  Containing 107 true matches and 664 true non-matches
    (13.88% true matches)
  Identified 740 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   712  (96.22%)
          2 :    25  (3.38%)
          3 :     3  (0.41%)

Identified 0 non-pure unique weight vectors (from 740 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 96
     0.000 : 644

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 771
  Number of unique weight vectors: 740

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (740, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 740 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 740 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 655 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 86 matches and 569 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (86, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (569, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 569 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 569 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.233, 0.545, 0.714, 0.455, 0.238] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.917, 0.000, 0.550, 0.455, 0.455, 0.000, 0.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.815, 0.643, 0.800, 0.750, 0.429] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.800, 0.000, 0.556, 0.182, 0.500, 0.071, 0.400] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 0 matches and 72 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

*** Warning: Oracle returned an empty match dictionary ***
Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

132.0
Analyzing file: diverg(20)455_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 455), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)455_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
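
Farthest-first selection greedily picks the vector whose minimum distance to everything already selected is largest. A minimal illustration of the traversal; the actual script's starting vector and distance metric are not shown here, so the first-element start and Euclidean distance below are assumptions:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric tuples."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]            # start point is an assumption
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest selected neighbour.
        far = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(far)
        remaining.remove(far)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
picks = farthest_first(pts, 3)   # spreads picks across the space
```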

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
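
Purity here is the majority-class fraction of the oracle-labelled sample, and entropy is the binary Shannon entropy of its match proportion (0 for a pure set, 1 for a 50/50 split). The sketch below (helper name assumed) reproduces the 0.701 / 0.880 figures for the 26-match / 61-non-match sample above:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Majority-class fraction and binary Shannon entropy of a labelled set."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_and_entropy(26, 61)   # ~0.701, ~0.880
```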

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches
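
To split the cluster, an SVM is trained on the oracle-labelled sample and its predictions partition the remaining unlabelled vectors into a predicted-match and a predicted-non-match part. A sketch using scikit-learn's `SVC`; the helper name and the linear kernel are assumptions, as the script's actual SVM configuration is not shown:

```python
from sklearn.svm import SVC

def split_cluster(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled sample, then use its predictions
    to split the remaining cluster into matches and non-matches."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches

# Toy example: high similarity values labelled match (1), low ones non-match (0).
train_X = [[0.9, 0.95], [0.85, 0.9], [0.1, 0.2], [0.2, 0.1]]
train_y = [1, 1, 0, 0]
m, n = split_cluster(train_X, train_y, [[0.88, 0.92], [0.15, 0.12]])
```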

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 793 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)399_NEW.csv
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 399), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)399_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 491
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 491 weight vectors
  Containing 172 true matches and 319 true non-matches
    (35.03% true matches)
  Identified 473 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   461  (97.46%)
          2 :     9  (1.90%)
          3 :     2  (0.42%)
          6 :     1  (0.21%)

Identified 0 non-pure unique weight vectors (from 473 unique weight vectors)
Pureness (as fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 154
     0.000 : 319

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 491
  Number of unique weight vectors: 473

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (473, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 473 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 473 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 22 matches and 58 non-matches
    Purity of oracle classification:  0.725
    Entropy of oracle classification: 0.849
    Number of true matches:      22
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 393 weight vectors
  Based on 22 matches and 58 non-matches
  Classified 96 matches and 297 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (96, 0.725, 0.8485481782946158, 0.275)
    (297, 0.725, 0.8485481782946158, 0.275)

Current size of match and non-match training data sets: 22 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 96 weight vectors
- Estimated match proportion 0.275

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 96 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.933, 1.000, 0.952, 1.000, 1.000, 0.944, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(10)85_NEW.csv
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 85), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)85_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 664
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 664 weight vectors
  Containing 200 true matches and 464 true non-matches
    (30.12% true matches)
  Identified 619 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   585  (94.51%)
          2 :    31  (5.01%)
          3 :     2  (0.32%)
         11 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 619 unique weight vectors)
Pureness (as fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 175
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 443

Removed 1 non-pure weight vector

Final number of weight vectors to use: 663
  Number of unique weight vectors: 619

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (619, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 619 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 619 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 536 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 159 matches and 377 non-matches
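
The SVM step trains on the oracle-labelled sample and then labels the remaining cluster members, splitting the cluster into predicted-match and predicted-non-match sub-clusters (536 vectors into 159 and 377 above). A hedged sketch using scikit-learn's `SVC`; the original script's kernel and parameters are not shown in this log, so library defaults are assumed, and `svm_split` is a hypothetical helper name:

```python
import numpy as np
from sklearn.svm import SVC  # assumed library; the original classifier settings are unknown

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train on the oracle-labelled weight vectors, then split the
    remaining vectors into predicted matches and non-matches."""
    clf = SVC()  # default RBF kernel is an assumption
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(rest_vecs))
    matches = [v for v, p in zip(rest_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, pred) if p == 0]
    return matches, non_matches
```

Both sub-clusters inherit the parent sample's purity, entropy, and estimated match proportion, which is why the two queue entries in the next loop share those statistics.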

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (377, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 159 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 159 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
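
Farthest first selection, as used above, greedily picks the weight vector whose minimum distance to the already-selected set is largest, so the sample spreads across the whole cluster. A minimal sketch assuming Euclidean distance and a fixed starting vector (the original script's seeding and tie-breaking may differ):

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal over a set of weight vectors."""
    X = np.asarray(vectors, dtype=float)
    selected = [start]
    # min_dist[i] = distance from vector i to its nearest selected vector
    min_dist = np.linalg.norm(X - X[start], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))  # farthest from everything chosen so far
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected
```

Each new pick updates the running minimum distances in O(n), so selecting k of n vectors costs O(kn) distance evaluations.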

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 45 matches and 11 non-matches
    Purity of oracle classification:  0.804
    Entropy of oracle classification: 0.715
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing file: diverg(15)589_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 589), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)589_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 777
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 777 weight vectors
  Containing 223 true matches and 554 true non-matches
    (28.70% true matches)
  Identified 723 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   686  (94.88%)
          2 :    34  (4.70%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)
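
The occurrence distribution above can be reproduced by hashing each weight vector as a tuple and counting duplicates. A small sketch (hypothetical helper name, assuming exact equality of weight values):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct
    weight vectors that occur exactly that often."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())

# Three copies of one vector, two of another, one singleton:
vecs = [[0.1, 0.2]] * 3 + [[0.3, 0.4]] * 2 + [[0.5, 0.6]]
print(occurrence_distribution(vecs))  # Counter({3: 1, 2: 1, 1: 1})
```

As a sanity check, summing count × frequency over the distribution recovers the total number of weight vectors (686·1 + 34·2 + 2·3 + 1·17 = 777 above).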

Identified 1 non-pure unique weight vector (from 723 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 533

Removed 1 non-pure weight vector

Final number of weight vectors to use: 776
  Number of unique weight vectors: 723
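
The non-pure-vector cleanup keeps, for each distinct weight vector, only the occurrences in its majority class; the 0.941-pure vector above (16 matches, 1 non-match across 17 copies) thus loses its single non-matching copy, taking 777 vectors down to 776. A hedged sketch with a hypothetical helper; ties are resolved toward matches here, which the original script may handle differently:

```python
from collections import defaultdict

def remove_minority_class(labelled_vectors):
    """labelled_vectors: iterable of (weight_vector_tuple, is_match).
    Drop minority-class occurrences so every unique vector becomes pure."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        majority = sum(labels) * 2 >= len(labels)  # assumption: ties count as matches
        kept.extend((vec, lab) for lab in labels if lab == majority)
    return kept

# A vector seen 17 times with 16 matches (pureness 0.941) loses one copy:
pairs = [((0.5, 0.5), True)] * 16 + [((0.5, 0.5), False)]
print(len(remove_minority_class(pairs)))  # prints 16
```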

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (723, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 723 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 723 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 638 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 114 matches and 524 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (114, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (524, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 524 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 524 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.714, 0.727, 0.750, 0.294, 0.833] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.348, 0.429, 0.526, 0.529, 0.619] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 18 matches and 54 non-matches
    Purity of oracle classification:  0.750
    Entropy of oracle classification: 0.811
    Number of true matches:      18
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(15)877_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 877), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)877_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 672
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 672 weight vectors
  Containing 217 true matches and 455 true non-matches
    (32.29% true matches)
  Identified 639 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   623  (97.50%)
          2 :    13  (2.03%)
          3 :     2  (0.31%)
         17 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 639 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 184
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 454

Removed 1 non-pure weight vector

Final number of weight vectors to use: 671
  Number of unique weight vectors: 639

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (639, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 639 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 639 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 30 matches and 53 non-matches
    Purity of oracle classification:  0.639
    Entropy of oracle classification: 0.944
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 556 weight vectors
  Based on 30 matches and 53 non-matches
  Classified 154 matches and 402 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (154, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)
    (402, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)

Current size of match and non-match training data sets: 30 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 402 weight vectors
- Estimated match proportion 0.361

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 402 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.565, 0.667, 0.600, 0.412, 0.381] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.556, 0.429, 0.500, 0.700, 0.643] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and incorrectly classify 0
  Classified 4 matches and 69 non-matches
    Purity of oracle classification:  0.945
    Entropy of oracle classification: 0.306
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0
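
The purity and entropy figures reported for each oracle classification can be reproduced from the match / non-match counts alone. A minimal sketch, assuming purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the binary split (which reproduces the numbers printed above):

```python
import math

def cluster_purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = Shannon
    entropy (base 2) of the match / non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# Counts from the oracle classification above: 4 matches, 69 non-matches.
purity, entropy = cluster_purity_entropy(4, 69)
print(round(purity, 3), round(entropy, 3))  # 0.945 0.306
```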

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)240_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (10, 1 - acm diverg, 240), dtype: object
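
The precision, recall and f-measure values in the pandas summary above follow directly from the confusion counts (tp, fp, fn). A quick check using the standard definitions (an assumption, but it reproduces the printed values for this file):

```python
def prf(tp, fp, fn):
    # Standard precision / recall / F-measure from confusion counts.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Counts from the summary above: tp=78, fp=1, fn=221.
p, r, f = prf(78, 1, 221)
print(round(p, 6), round(r, 6), round(f, 6))  # 0.987342 0.26087 0.412698
```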

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)240_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 804
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 804 weight vectors
  Containing 188 true matches and 616 true non-matches
    (23.38% true matches)
  Identified 762 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   731  (95.93%)
          2 :    28  (3.67%)
          3 :     2  (0.26%)
         11 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 762 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 166
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 595

Removed 1 non-pure weight vector
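
The removal of non-pure weight vectors can be sketched as follows: for every unique weight vector, compute its pureness (the fraction of its copies that are true matches) and drop the copies belonging to the minority class. This reproduces the step above, where the single 0.909-pureness vector loses its one minority-class copy. The tie-breaking at pureness 0.5 is an assumption:

```python
from collections import Counter

def remove_non_pure(weight_vectors, labels):
    """Drop minority-class copies of each unique weight vector, so every
    surviving unique vector is pure (all copies share one label)."""
    counts = Counter()
    match_counts = Counter()
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)
        counts[key] += 1
        match_counts[key] += int(is_match)
    kept_vecs, kept_labels = [], []
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)
        pureness = match_counts[key] / counts[key]
        majority_is_match = pureness >= 0.5  # tie-break is an assumption
        if is_match == majority_is_match:
            kept_vecs.append(vec)
            kept_labels.append(is_match)
    return kept_vecs, kept_labels
```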

Final number of weight vectors to use: 803
  Number of unique weight vectors: 762

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (762, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 762 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 762 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.722, 0.471, 0.545, 0.579] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.800, 0.000, 0.556, 0.182, 0.500, 0.071, 0.400] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.344, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.300, 0.524, 0.727, 0.762] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
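
The "far" initial-selection method above is a farthest-first traversal: starting from one vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the whole cluster. A minimal sketch (Euclidean distance and a random starting vector are assumptions):

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Select k vectors by farthest-first traversal: begin with a random
    vector, then greedily add the vector maximising its minimum
    Euclidean distance to the selected set."""
    rng = random.Random(seed)
    selected = [rng.choice(vectors)]
    while len(selected) < k:
        best, best_dist = None, -1.0
        for v in vectors:
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected

# Toy example: the two selected points end up far apart.
pts = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.1), (1.0, 0.0)]
picked = farthest_first(pts, 2)
```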

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and incorrectly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
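
The oracle step can be simulated as a labeller that reports each true match status correctly with probability `accuracy` (here 1.0, so no labels are flipped, which is why all false match/non-match counts are zero). A minimal sketch, assuming independent per-vector errors:

```python
import random

def run_oracle(true_labels, accuracy, seed=0):
    """Report each true label correctly with probability `accuracy`,
    otherwise flip it (simulating an imperfect human oracle)."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]

# With accuracy 1.0 (as in this run) every label comes back unchanged,
# since random() always returns a value in [0, 1).
labels = run_oracle([True, False, True], 1.0)
```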

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 677 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 130 matches and 547 non-matches
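
The SVM step above trains on the 27 + 58 oracle-labelled vectors and partitions the remaining 677 by predicted class, producing the two child clusters in the next loop. A sketch using scikit-learn's `SVC` (both the kernel choice and the use of scikit-learn are assumptions; the original program may use a different SVM implementation):

```python
from sklearn.svm import SVC

def split_cluster_svm(train_vecs, train_labels, rest_vecs):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining weight vectors into predicted matches / non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, pred) if p == 0]
    return matches, non_matches
```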

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (547, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 130 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 130 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and incorrectly classify 0
  Classified 46 matches and 5 non-matches
    Purity of oracle classification:  0.902
    Entropy of oracle classification: 0.463
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(10)314_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.977444
recall                 0.434783
f-measure              0.601852
da                          133
dm                            0
ndm                           0
tp                          130
fp                            3
tn                  4.76529e+07
fn                          169
Name: (10, 1 - acm diverg, 314), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)314_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 769
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 769 weight vectors
  Containing 124 true matches and 645 true non-matches
    (16.12% true matches)
  Identified 738 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   710  (96.21%)
          2 :    25  (3.39%)
          3 :     3  (0.41%)

Identified 0 non-pure unique weight vectors (from 738 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 113
     0.000 : 625

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 769
  Number of unique weight vectors: 738

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (738, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 738 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 738 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and incorrectly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 653 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 103 matches and 550 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (550, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 103 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 103 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.947, 1.000, 0.292, 0.178, 0.227, 0.122, 0.154] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and incorrectly classify 0
  Classified 37 matches and 11 non-matches
    Purity of oracle classification:  0.771
    Entropy of oracle classification: 0.777
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

133.0
Analysing file: diverg(20)307_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 307), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)307_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 789
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 789 weight vectors
  Containing 225 true matches and 564 true non-matches
    (28.52% true matches)
  Identified 750 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   731  (97.47%)
          2 :    16  (2.13%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 750 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 561

Removed 1 non-pure weight vector

Final number of weight vectors to use: 788
  Number of unique weight vectors: 750

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (750, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 750 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 750 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 34 matches and 51 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0
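
The purity and entropy figures reported for each oracle classification follow directly from the match/non-match counts: purity is the fraction of the majority class, and entropy is the binary Shannon entropy of the class split. A minimal sketch (the function name `purity_entropy` is ours, not from the program):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class.
    Entropy: binary Shannon entropy (base 2) of the class split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# For the 34 matches / 51 non-matches classified above:
print(purity_entropy(34, 51))  # purity 0.6, entropy ≈ 0.971
```

These are the same figures (0.600 and 0.971) printed for this oracle call, and the same values carried into the cluster queue entries below.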

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 665 weight vectors
  Based on 34 matches and 51 non-matches
  Classified 153 matches and 512 non-matches
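
The SVM split step trains a binary classifier on the oracle-labelled weight vectors and partitions the remaining vectors by predicted class. The log does not show the kernel or library used; the sketch below assumes scikit-learn with a linear kernel, and the name `svm_split` is hypothetical:

```python
from sklearn import svm

def svm_split(train_matches, train_non_matches, unlabelled):
    """Train an SVM on the oracle-labelled vectors, then split the
    unlabelled vectors into a predicted-match and a predicted-non-match
    cluster (a sketch; the original classifier settings are not shown)."""
    X = train_matches + train_non_matches
    y = [1] * len(train_matches) + [0] * len(train_non_matches)
    clf = svm.SVC(kernel="linear")
    clf.fit(X, y)
    pred = clf.predict(unlabelled)
    match_cluster = [v for v, p in zip(unlabelled, pred) if p == 1]
    non_match_cluster = [v for v, p in zip(unlabelled, pred) if p == 0]
    return match_cluster, non_match_cluster
```

The two returned clusters are what gets pushed onto the queue for the next loop iteration.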

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6, 0.9709505944546686, 0.4)
    (512, 0.6, 0.9709505944546686, 0.4)

Current size of match and non-match training data sets: 34 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.60 and entropy 0.97
- Size 153 weight vectors
- Estimated match proportion 0.400

Sample size for this cluster: 58

Farthest-first selection of 58 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
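
Farthest-first selection, in a hedged sketch: seed the sample with one vector, then greedily add the vector whose minimum distance to the already-selected set is largest. The log does not show the distance metric; Euclidean is assumed here, and `farthest_first` is a hypothetical name:

```python
import random

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from a random vector, then
    repeatedly add the vector whose minimum distance to the vectors
    already selected is largest.  Assumes k <= len(vectors)."""

    def dist(a, b):
        # Euclidean distance between two weight vectors (assumed metric).
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [random.choice(vectors)]
    while len(selected) < k:
        candidates = [v for v in vectors if v not in selected]
        selected.append(
            max(candidates, key=lambda v: min(dist(v, s) for s in selected))
        )
    return selected
```

A production version would cache each candidate's current minimum distance instead of recomputing it on every round, but the greedy structure is the same.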

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 50 matches and 8 non-matches
    Purity of oracle classification:  0.862
    Entropy of oracle classification: 0.579
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
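
The overall procedure visible in this log (pop a cluster from the queue, have the oracle label a sample of it, split the unlabelled remainder, stop when the manual classification budget is spent) can be sketched as follows. The callables `oracle`, `sample`, and `split` are hypothetical stand-ins for the manual classification, the farthest-first sampler, and the SVM split, and the real sample-size and purity-based stopping rules are more elaborate:

```python
import random

def recursive_selection(initial_cluster, budget, oracle, sample, split):
    """Outer loop: pick a random cluster from the queue, let the oracle
    label a sample of it (spending budget), then split the unlabelled
    remainder into child clusters and push them back on the queue."""
    queue = [initial_cluster]
    train_match, train_non_match = [], []
    used = 0
    while queue and used < budget:
        # Queue ordering: random, as reported in the log.
        cluster = queue.pop(random.randrange(len(queue)))
        chosen = sample(cluster, min(budget - used, len(cluster)))
        used += len(chosen)
        for vec in chosen:
            (train_match if oracle(vec) else train_non_match).append(vec)
        remainder = [v for v in cluster if v not in chosen]
        queue.extend(c for c in split(train_match, train_non_match, remainder) if c)
    return train_match, train_non_match, used
```

When `used` reaches the budget the loop ends, which corresponds to the "Reached end of manual classification budget" message above.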

39.0
Analyzing file: diverg(20)861_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 861), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)861_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector
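
The pureness analysis above groups identical weight vectors, computes each group's fraction of true-match occurrences, and removes the minority-class occurrences of any non-pure group. A dependency-free sketch (the function name and the tie handling at pureness 0.5 are our assumptions):

```python
from collections import defaultdict

def remove_minority(labelled_vectors):
    """Group identical weight vectors, compute each group's pureness
    (fraction of true-match occurrences), and keep only the
    majority-class occurrences of each group."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        majority = pureness >= 0.5  # ties treated as matches (assumption)
        kept.extend((list(vec), m) for m in labels if m == majority)
    return kept
```

For example, a vector with 18 matching and 1 non-matching occurrence has pureness 18/19 ≈ 0.947, as in the case reported above, and its single non-matching occurrence is removed.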

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using the "far" method

Farthest-first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest-first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(15)752_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 752), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)752_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1067
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1067 weight vectors
  Containing 221 true matches and 846 true non-matches
    (20.71% true matches)
  Identified 1011 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   975  (96.44%)
          2 :    33  (3.26%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1011 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 825

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1066
  Number of unique weight vectors: 1011

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1011, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1011 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using the "far" method

Farthest-first selection of 87 weight vectors from 1011 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 924 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 106 matches and 818 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (106, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (818, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 106 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 44

Farthest-first selection of 44 weight vectors from 106 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
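The farthest-first picks above are produced by the program itself; as an aside, the greedy farthest-first traversal it describes can be sketched in a few lines (the function name `farthest_first` and the use of Euclidean distance are assumptions, not the original implementation):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: start from the first vector, then
    # repeatedly add the vector whose minimum distance to the already
    # selected vectors is largest.
    selected = [vectors[0]]
    while len(selected) < k and len(selected) < len(vectors):
        best_vec, best_dist = None, -1.0
        for v in vectors:
            if v in selected:
                continue
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best_vec, best_dist = v, d
        selected.append(best_vec)
    return selected
```

This is why the selected sample spreads over the extremes of the cluster rather than its centre: each pick maximises the minimum distance to everything chosen so far.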

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 44 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0
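The purity and entropy figures reported throughout this log follow directly from the match/non-match counts; a minimal sketch (assuming binary Shannon entropy, base 2, and majority-class purity) reproduces them:

```python
import math

def purity(num_match, num_non_match):
    # Fraction of the majority class among the classified vectors.
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    # Binary Shannon entropy (base 2) of the match/non-match split;
    # 0.0 for a pure split, 1.0 for a 50/50 split.
    total = num_match + num_non_match
    result = 0.0
    for count in (num_match, num_non_match):
        if count > 0:
            p = count / total
            result -= p * math.log2(p)
    return result
```

For the 44/0 split above this gives purity 1.000 and entropy 0.000; a 37/41 split yields purity ≈ 0.526 and entropy ≈ 0.998, matching the queue statistics printed in later loops.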

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)337_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (10, 1 - acm diverg, 337), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)337_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 450
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 450 weight vectors
  Containing 199 true matches and 251 true non-matches
    (44.22% true matches)
  Identified 418 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   402  (96.17%)
          2 :    13  (3.11%)
          3 :     2  (0.48%)
         16 :     1  (0.24%)
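A frequency table like the one above can be obtained by counting exact duplicates among the weight vectors and then counting how many unique vectors share each occurrence count (the helper name `occurrence_distribution` is an assumption):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First count how often each exact weight vector occurs, then count
    # how many unique vectors share each occurrence count.
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```

The number of unique weight vectors is simply `len(per_vector)`.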

Identified 1 non-pure unique weight vector (from 418 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 169
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 248

Removed 1 non-pure weight vector

Final number of weight vectors to use: 449
  Number of unique weight vectors: 418

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (418, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 418 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 418 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 37 matches and 41 non-matches
    Purity of oracle classification:  0.526
    Entropy of oracle classification: 0.998
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  41
    Number of false non-matches: 0
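The oracle step simulates manual classification at a configurable accuracy; a plausible sketch (the name `noisy_oracle` and the per-label flip model are assumptions — the original oracle code is not shown here) is to flip each true label with probability `1 - accuracy`:

```python
import random

def noisy_oracle(true_labels, accuracy, seed=42):
    # Simulate a human oracle of the given accuracy: each true label is
    # flipped with probability (1 - accuracy). With accuracy = 1.0 the
    # oracle reproduces the ground truth exactly, as in this run.
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

With `accuracy=1.0` every `rng.random()` draw is below the threshold, so no label is flipped and all false matches/non-matches are zero, as reported above.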

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 340 weight vectors
  Based on 37 matches and 41 non-matches
  Classified 133 matches and 207 non-matches
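The split step trains an SVM on the oracle-labelled vectors and partitions the remaining cluster by predicted class. A sketch using scikit-learn (an assumption — the kernel choice and the original classifier code are not shown in this log) could look like:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    # Train an SVM on the oracle-classified weight vectors, then split
    # the rest of the cluster by predicted match status.
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, preds) if p]
    non_matches = [v for v, p in zip(rest_vecs, preds) if not p]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows from 1 to 2 in the next loop.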

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.5256410256410257, 0.9981021327390103, 0.47435897435897434)
    (207, 0.5256410256410257, 0.9981021327390103, 0.47435897435897434)

Current size of match and non-match training data sets: 37 / 41

Selected cluster (queue ordering: random):
- Purity 0.53 and entropy 1.00
- Size 133 weight vectors
- Estimated match proportion 0.474

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 133 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 50 matches and 6 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.491
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing the file: diverg(20)882_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 882), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)882_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(10)425_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (10, 1 - acm diverg, 425), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)425_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 301
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 301 weight vectors
  Containing 203 true matches and 98 true non-matches
    (67.44% true matches)
  Identified 270 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   257  (95.19%)
          2 :    10  (3.70%)
          3 :     2  (0.74%)
         18 :     1  (0.37%)

Identified 1 non-pure unique weight vector (from 270 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 97

Removed 1 non-pure weight vector

Final number of weight vectors to use: 300
  Number of unique weight vectors: 270

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (270, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 270 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 71

Perform initial selection using "far" method

Farthest first selection of 71 weight vectors from 270 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
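
The "farthest first" selections above can be sketched as a greedy traversal: repeatedly add the weight vector whose minimum distance to the already-selected set is largest. The seed choice and distance metric below (first vector, Euclidean) are assumptions; the log does not state them.

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors.

    Starts from the first vector (an assumed seed), then repeatedly adds
    the remaining vector whose minimum Euclidean distance to the
    already-selected set is largest, until k vectors are selected.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest selected neighbour
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Farthest-first selection favours spread-out, diverse weight vectors, which is why the selected lists above mix high-similarity (likely match) and low-similarity (likely non-match) vectors.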

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 37 matches and 34 non-matches
    Purity of oracle classification:  0.521
    Entropy of oracle classification: 0.999
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  34
    Number of false non-matches: 0
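
The purity and entropy figures reported for each oracle classification follow the usual definitions: purity is the proportion of the majority class, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = proportion of the majority class; entropy = binary
    Shannon entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    purity = max(num_matches, num_non_matches) / total
    p = num_matches / total
    # Binary entropy, guarding against log2(0) for pure clusters
    entropy = -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)
    return purity, entropy
```

For example, `purity_entropy(37, 34)` reproduces the 0.521 / 0.999 reported above.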

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 199 weight vectors
  Based on 37 matches and 34 non-matches
  Classified 141 matches and 58 non-matches
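
The SVM split of the remaining 199 weight vectors, trained on the 37 + 34 oracle-labelled vectors, can be sketched with scikit-learn's `SVC`. The kernel choice and the synthetic stand-in data below are assumptions; the log does not say which SVM implementation, kernel, or parameters the script uses.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins for the oracle-labelled training vectors
# (7 similarity weights each, as in the log above).
X_match = rng.uniform(0.6, 1.0, size=(37, 7))
X_non_match = rng.uniform(0.0, 0.5, size=(34, 7))
X_train = np.vstack([X_match, X_non_match])
y_train = np.array([1] * 37 + [0] * 34)

clf = SVC(kernel="linear")  # kernel choice is an assumption
clf.fit(X_train, y_train)

# Classify the remaining unlabelled vectors, splitting the cluster
# into a predicted-match cluster and a predicted-non-match cluster.
X_rest = rng.uniform(0.0, 1.0, size=(199, 7))
pred = clf.predict(X_rest)
cluster_match = X_rest[pred == 1]
cluster_non_match = X_rest[pred == 0]
```

The two resulting clusters are then pushed back onto the queue, which is why the queue length grows from 1 to 2 in the next loop.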

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 71
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.5211267605633803, 0.9987117514654895, 0.5211267605633803)
    (58, 0.5211267605633803, 0.9987117514654895, 0.5211267605633803)

Current size of match and non-match training data sets: 37 / 34

Selected cluster (queue ordering: random) with:
- Purity 0.52 and entropy 1.00
- Size 141 weight vectors
- Estimated match proportion 0.521

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.950, 0.923, 0.941] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 48 matches and 9 non-matches
    Purity of oracle classification:  0.842
    Entropy of oracle classification: 0.629
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing the file: diverg(10)933_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 933), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)933_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 274
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 274 weight vectors
  Containing 171 true matches and 103 true non-matches
    (62.41% true matches)
  Identified 256 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   247  (96.48%)
          2 :     6  (2.34%)
          3 :     2  (0.78%)
          9 :     1  (0.39%)

Identified 1 non-pure unique weight vector (from 256 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 153
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 102

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 265
  Number of unique weight vectors: 255
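
The uniqueness and pureness analysis above (grouping identical weight vectors, counting their occurrences, and flagging vectors whose occurrences mix true matches and non-matches) can be sketched as:

```python
from collections import defaultdict

def pureness_of_unique_vectors(vectors, match_flags):
    """Group identical weight vectors and compute each unique vector's
    pureness: the fraction of its occurrences that are true matches.

    Vectors with pureness strictly between 0 and 1 are 'non-pure'
    (the same similarity vector was generated by both matching and
    non-matching record pairs) and, as in the log above, are flagged
    for removal.
    """
    groups = defaultdict(list)
    for vec, is_match in zip(vectors, match_flags):
        groups[tuple(vec)].append(is_match)
    pureness = {v: sum(f) / len(f) for v, f in groups.items()}
    non_pure = [v for v, p in pureness.items() if 0.0 < p < 1.0]
    return pureness, non_pure
```

For instance, a unique vector occurring 9 times with 8 true matches has pureness 8/9 ≈ 0.889, matching the non-pure entry reported above.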

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (255, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 255 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 70

Perform initial selection using "far" method

Farthest first selection of 70 weight vectors from 255 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 35 matches and 35 non-matches
    Purity of oracle classification:  0.500
    Entropy of oracle classification: 1.000
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  35
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 185 weight vectors
  Based on 35 matches and 35 non-matches
  Classified 120 matches and 65 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 70
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (120, 0.5, 1.0, 0.5)
    (65, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 35 / 35

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 120 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 120 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 47 matches and 7 non-matches
    Purity of oracle classification:  0.870
    Entropy of oracle classification: 0.556
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing the file: diverg(15)136_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.99
recall                 0.331104
f-measure              0.496241
da                          100
dm                            0
ndm                           0
tp                           99
fp                            1
tn                  4.76529e+07
fn                          200
Name: (15, 1 - acm diverg, 136), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)136_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1039
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1039 weight vectors
  Containing 167 true matches and 872 true non-matches
    (16.07% true matches)
  Identified 1000 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   971  (97.10%)
          2 :    26  (2.60%)
          3 :     2  (0.20%)
         10 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1000 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 148
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 851

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1038
  Number of unique weight vectors: 1000

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1000, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1000 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1000 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0
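
The oracle's accuracy parameter (100.00% here, hence zero wrong classifications) can be simulated by flipping each true label with probability 1 − accuracy; a minimal sketch, with the per-vector independence assumption made explicit:

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Simulate a noisy oracle: each true match label is returned
    unchanged with probability `accuracy` and flipped otherwise
    (flips are assumed independent per weight vector)."""
    rng = rng or random.Random()
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]
```

At accuracy 1.0 every label is returned unchanged, which is why the counts of false matches and false non-matches above are always zero in this run.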

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 913 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 57 matches and 856 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (57, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (856, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 57 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 33

Farthest first selection of 33 weight vectors from 57 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 1.000, 1.000, 0.952, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [0.420, 1.000, 1.000, 1.000, 1.000, 1.000, 0.947] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)

Perform oracle with 100.00% accuracy on 33 weight vectors
  The oracle will correctly classify 33 weight vectors and wrongly classify 0
  Classified 33 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 33 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing the file: diverg(20)894_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 894), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)894_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1035
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1035 weight vectors
  Containing 223 true matches and 812 true non-matches
    (21.55% true matches)
  Identified 981 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   944  (96.23%)
          2 :    34  (3.47%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)
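
The unique-vector count and the occurrence histogram above amount to two rounds of counting; a sketch using Python's `collections.Counter` on hypothetical toy data (the variable names are illustrative):

```python
from collections import Counter

# Hypothetical toy data: each weight vector is a tuple of similarity values.
weight_vectors = [
    (1.0, 1.0, 0.9), (1.0, 1.0, 0.9), (0.5, 0.2, 0.1),
    (0.5, 0.2, 0.1), (0.5, 0.2, 0.1), (0.0, 0.1, 0.3),
]

vec_counts = Counter(weight_vectors)      # occurrences of each unique vector
print(len(vec_counts))                    # number of unique vectors -> 3

freq_dist = Counter(vec_counts.values())  # occurrence : how many vectors
for occ, num in sorted(freq_dist.items()):
    print(f"{occ:10d} : {num:5d}  ({100.0 * num / len(vec_counts):.2f}%)")
```

The same two-level counting, applied to the full weight vector set, yields the 981-unique-vector histogram printed above.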

Identified 1 non-pure unique weight vector (from 981 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 791

Removed 1 non-pure weight vector
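
The removal step above can be sketched as follows: group the records by unique weight vector, compute each group's pureness (fraction of true matches), and for any non-pure group drop the minority-class copies. This is a reconstruction from the log output, not the program's actual code:

```python
from collections import defaultdict

def remove_minority_class(pairs):
    """pairs: list of (weight_vector, is_match) records. For each non-pure
    unique vector (0 < pureness < 1), drop its minority-class copies."""
    by_vec = defaultdict(list)
    for vec, is_match in pairs:
        by_vec[vec].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        pureness = sum(labels) / len(labels)  # fraction of true matches
        if 0.0 < pureness < 1.0:
            majority = pureness >= 0.5  # True if matches are the majority
            kept.extend((vec, lab) for lab in labels if lab == majority)
        else:
            kept.extend((vec, lab) for lab in labels)
    return kept

# Mirrors the log: one vector occurs 17 times, 16 as match and 1 as
# non-match (pureness 0.941); only the single minority copy is removed.
pairs = [(("v1",), True)] * 16 + [(("v1",), False), (("v2",), False)]
print(len(remove_minority_class(pairs)))  # 17
```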

Final number of weight vectors to use: 1034
  Number of unique weight vectors: 981

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (981, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 981 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 981 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
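
Farthest-first traversal, as used for the selection above, repeatedly picks the vector with the greatest distance to its nearest already-selected vector. A minimal sketch with Euclidean distance and an arbitrary first pick (the actual program's seed choice and tie-breaking may differ):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal (Euclidean distance)."""
    selected = [vectors[0]]  # arbitrary starting point
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        # pick the vector farthest from everything selected so far
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```

Each iteration costs O(n) distance updates, so selecting k of n vectors is O(nk), which keeps the sampling cheap even for the thousand-vector clusters in this log.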

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 894 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 156 matches and 738 non-matches
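
The split step trains a binary classifier on the oracle-labelled sample and uses its predictions to divide the remaining cluster into two child clusters. A sketch assuming scikit-learn's `SVC`; the kernel and parameters are assumptions, since the log does not show them:

```python
from sklearn.svm import SVC

def svm_split(train_match, train_non_match, remaining):
    """Train an SVM on oracle-labelled vectors, then split the remaining
    vectors into predicted-match / predicted-non-match sub-clusters."""
    X = train_match + train_non_match
    y = [1] * len(train_match) + [0] * len(train_non_match)
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(X, y)
    pred = clf.predict(remaining)
    matches = [v for v, p in zip(remaining, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining, pred) if p == 0]
    return matches, non_matches

# Toy example: high similarity values -> match, low -> non-match.
m, n = svm_split([[0.9, 0.95], [0.8, 0.9]],
                 [[0.1, 0.2], [0.2, 0.1]],
                 [[0.85, 0.9], [0.15, 0.1]])
print(len(m), len(n))  # 1 1
```

In the run above the same idea splits the 894 unclassified vectors into a 156-vector predicted-match cluster and a 738-vector predicted-non-match cluster, both of which go back on the queue.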

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (738, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 738 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 738 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 5 matches and 70 non-matches
    Purity of oracle classification:  0.933
    Entropy of oracle classification: 0.353
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)146_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 146), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)146_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 717
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 717 weight vectors
  Containing 217 true matches and 500 true non-matches
    (30.26% true matches)
  Identified 681 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   662  (97.21%)
          2 :    16  (2.35%)
          3 :     2  (0.29%)
         17 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 681 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 183
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 497

Removed 1 non-pure weight vector

Final number of weight vectors to use: 716
  Number of unique weight vectors: 681

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (681, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 681 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 681 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 34 matches and 50 non-matches
    Purity of oracle classification:  0.595
    Entropy of oracle classification: 0.974
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 597 weight vectors
  Based on 34 matches and 50 non-matches
  Classified 288 matches and 309 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (288, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)
    (309, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)

Current size of match and non-match training data sets: 34 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.60 and entropy 0.97
- Size 288 weight vectors
- Estimated match proportion 0.405

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 288 vectors
  The selected farthest weight vectors are:
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 45 matches and 25 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  25
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(20)469_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 469), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)469_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1069
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1069 weight vectors
  Containing 221 true matches and 848 true non-matches
    (20.67% true matches)
  Identified 1013 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   977  (96.45%)
          2 :    33  (3.26%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1013 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1068
  Number of unique weight vectors: 1013

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1013, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1013 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1013 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
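
The "far" method above is a greedy farthest-first traversal: starting from one vector, it repeatedly adds the vector whose minimum distance to everything already selected is largest. A minimal sketch (my own reconstruction, not the script's code), assuming Euclidean distance and a fixed starting index:

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: beginning from `start`, repeatedly
    pick the vector whose minimum Euclidean distance to the already
    selected vectors is largest."""
    X = np.asarray(vectors, dtype=float)
    selected = [start]
    # distance of every vector to its nearest selected vector
    d = np.linalg.norm(X - X[start], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(d))  # farthest from the current selection
        selected.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return selected
```

Farthest-first traversal is the classic greedy 2-approximation for the k-center problem, which is why the selected vectors spread across the extremes of the weight space rather than clustering together.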

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
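
The purity and entropy figures reported for an oracle-classified sample follow the usual two-class definitions: purity is the majority-class fraction, entropy the Shannon entropy (in bits) of the class distribution. A small sketch of the computation (a reconstruction from the logged values, not the original code):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Two-class cluster quality: purity is the majority-class fraction,
    entropy is the Shannon entropy (in bits) of the class distribution."""
    n = num_matches + num_non_matches
    p = num_matches / n
    purity = max(p, 1 - p)
    entropy = 0.0
    for q in (p, 1 - p):
        if q > 0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

With the 27 matches and 60 non-matches above this gives purity 60/87 ≈ 0.690 and entropy ≈ 0.894, the values reported in the log.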

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 926 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 142 matches and 784 non-matches
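
The split step trains an SVM on the oracle-labelled sample and partitions the remaining, unlabelled vectors of the cluster by predicted class. A minimal sketch assuming scikit-learn's `SVC` with a linear kernel stands in for the SVM used (the log does not show the actual classifier settings):

```python
import numpy as np
from sklearn.svm import SVC  # assumption: SVC stands in for the SVM used

def svm_split(train_X, train_y, rest_X):
    """Fit an SVM on the oracle-labelled sample (1 = match, 0 = non-match)
    and split the still-unlabelled vectors of the cluster into a
    predicted-match and a predicted-non-match sub-cluster."""
    clf = SVC(kernel="linear")
    clf.fit(train_X, train_y)
    rest_X = np.asarray(rest_X, dtype=float)
    pred = clf.predict(rest_X)
    return rest_X[pred == 1], rest_X[pred == 0]
```

Both sub-clusters then go back on the queue carrying the parent sample's purity, entropy, and match-proportion estimates, as the Loop 2 header shows.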

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (784, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 784 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 784 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 8 matches and 66 non-matches
    Purity of oracle classification:  0.892
    Entropy of oracle classification: 0.494
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(10)775_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 775), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)775_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1005
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1005 weight vectors
  Containing 195 true matches and 810 true non-matches
    (19.40% true matches)
  Identified 963 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   928  (96.37%)
          2 :    32  (3.32%)
          3 :     2  (0.21%)
          7 :     1  (0.10%)
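
The frequency distribution counts how often each distinct weight vector occurs and then histograms those occurrence counts. Something like the following sketch (an illustration, assuming vectors are given as sequences of floats):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map occurrence count -> number of distinct weight vectors that
    occur that many times (e.g. {1: 928, 2: 32, 3: 2, 7: 1} above)."""
    per_vector = Counter(tuple(v) for v in vectors)
    return Counter(per_vector.values())
```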

Identified 0 non-pure unique weight vectors (from 963 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 790
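
Pureness of a unique weight vector is the fraction of its occurrences that were generated by true matching record pairs; 1.0 and 0.0 indicate pure vectors, anything in between marks a conflict whose minority-class copies get removed. A sketch of the check (my reconstruction):

```python
from collections import Counter

def pureness_counts(vectors, is_match):
    """For each distinct weight vector, the fraction of its occurrences
    generated by true matches; 1.0 and 0.0 indicate pure vectors."""
    matches, totals = Counter(), Counter()
    for v, y in zip(vectors, is_match):
        key = tuple(v)
        totals[key] += 1
        matches[key] += int(y)
    return {k: matches[k] / totals[k] for k in totals}
```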

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1005
  Number of unique weight vectors: 963

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (963, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 963 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 963 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 876 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 138 matches and 738 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (738, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 738 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 738 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 5 matches and 70 non-matches
    Purity of oracle classification:  0.933
    Entropy of oracle classification: 0.353
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analysing file: diverg(20)126_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 126), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)126_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1035
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1035 weight vectors
  Containing 223 true matches and 812 true non-matches
    (21.55% true matches)
  Identified 981 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   944  (96.23%)
          2 :    34  (3.47%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 981 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 791

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1034
  Number of unique weight vectors: 981

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (981, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 981 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 981 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

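The purity and entropy figures reported above follow the standard two-class definitions: purity is the fraction of the majority class in the cluster, and entropy is the binary Shannon entropy of the match proportion (here 28/87 ≈ 0.322, which also appears as the estimated match proportion in the queue listing below). A minimal sketch of that computation; the helper name `cluster_stats` is ours, not from the program:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity and binary entropy of a cluster, given its class counts."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)         # fraction of the majority class
    # Binary Shannon entropy; 0 * log2(0) is taken as 0 for pure clusters
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# The oracle result above: 28 matches, 59 non-matches
purity, entropy = cluster_stats(28, 59)
print(round(purity, 3), round(entropy, 3))   # 0.678 0.906
```

A pure cluster (all matches or all non-matches) gives purity 1.0 and entropy 0.0, which is the stopping condition the splitting loop is driving towards.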
Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 894 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 156 matches and 738 non-matches

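The SVM step above trains on the oracle-labelled seed (28 matches, 59 non-matches) and classifies the remaining unlabelled vectors; the two predicted groups become the child clusters pushed back onto the queue. The original program presumably uses a full SVM library; as a self-contained stand-in, here is a tiny linear SVM trained by sub-gradient descent on the hinge loss (all names and the toy data are ours):

```python
def train_linear_svm(X, y, lam=0.01, epochs=300, lr=0.1):
    """Tiny linear SVM: sub-gradient descent on the regularised hinge loss.
    X: list of feature vectors, y: labels in {+1 (match), -1 (non-match)}."""
    d = len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:   # inside the margin: hinge sub-gradient step
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:              # correctly classified: only regularise
                w = [wj * (1.0 - lr * lam) for wj in w]
    return w, b

def classify(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Demo: a separable oracle-labelled seed (high weights = likely match)
X = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.0, 0.1]]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)
print([classify(w, b, x) for x in X])   # [1, 1, -1, -1]
```

Splitting a cluster then amounts to partitioning its remaining vectors by `classify`, which matches the "131 matches and 817 non-matches" style of output above.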
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (738, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 738 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 738 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)

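Farthest-first selection, as used above, greedily picks at each step the unselected vector whose distance to its nearest already-selected vector is largest, so the sample spreads across the whole cluster. A sketch assuming Euclidean distance and an arbitrary (here: the first) starting vector; the actual program may seed the traversal differently:

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    """Select up to k vectors by farthest-first traversal (greedy k-centre)."""
    selected = [vectors[0]]           # assumption: start from the first vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # candidate whose nearest selected vector is farthest away
        best = max(remaining,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

pts = [[0.0], [1.0], [5.0], [10.0]]
print(farthest_first(pts, 3))   # [[0.0], [10.0], [5.0]]
```

The naive version above is O(k·n·k) in distance evaluations; caching each candidate's nearest-selected distance reduces this to O(k·n).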
Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 5 matches and 70 non-matches
    Purity of oracle classification:  0.933
    Entropy of oracle classification: 0.353
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)898_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.977444
recall                 0.434783
f-measure              0.601852
da                          133
dm                            0
ndm                           0
tp                          130
fp                            3
tn                  4.76529e+07
fn                          169
Name: (10, 1 - acm diverg, 898), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)898_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 598
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 598 weight vectors
  Containing 131 true matches and 467 true non-matches
    (21.91% true matches)
  Identified 585 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   577  (98.63%)
          2 :     5  (0.85%)
          3 :     2  (0.34%)
          5 :     1  (0.17%)

Identified 0 non-pure unique weight vectors (from 585 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 118
     0.000 : 467

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 598
  Number of unique weight vectors: 585

Time to load and analyse the weight vector file: 0.01 sec

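The load-and-analyse step above groups identical weight vectors and reports, for each unique vector, how often it occurs and its "pureness": the fraction of record pairs sharing that vector that are true matches (a vector is non-pure when pairs with identical weights carry conflicting labels). A sketch of that bookkeeping; the function name `analyse_vectors` and the toy data are ours:

```python
from collections import defaultdict

def analyse_vectors(pairs):
    """pairs: list of (weight_vector_tuple, is_match) items.
    Returns {unique_vector: (occurrence_count, pureness)}."""
    counts = defaultdict(lambda: [0, 0])    # vector -> [occurrences, matches]
    for vec, is_match in pairs:
        counts[vec][0] += 1
        counts[vec][1] += int(is_match)
    return {vec: (n, m / n) for vec, (n, m) in counts.items()}

pairs = [((1.0, 0.5), True), ((1.0, 0.5), True), ((1.0, 0.5), False),
         ((0.2, 0.1), False)]
stats = analyse_vectors(pairs)
print(stats[(1.0, 0.5)])   # (3, 0.6666666666666666)  -> a non-pure vector
```

Vectors with pureness strictly between 0.0 and 1.0 are the "non-pure" ones; as the later runs show, the minority-class copies of such vectors are removed before training selection starts.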
Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (585, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 585 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 585 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 30 matches and 52 non-matches
    Purity of oracle classification:  0.634
    Entropy of oracle classification: 0.947
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 503 weight vectors
  Based on 30 matches and 52 non-matches
  Classified 76 matches and 427 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (76, 0.6341463414634146, 0.9474351361840306, 0.36585365853658536)
    (427, 0.6341463414634146, 0.9474351361840306, 0.36585365853658536)

Current size of match and non-match training data sets: 30 / 52

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 427 weight vectors
- Estimated match proportion 0.366

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 427 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.409, 0.654, 0.500, 0.516, 0.333] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.500, 0.452, 0.632, 0.714, 0.667] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.786, 0.833, 0.545, 0.478, 0.346] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 8 matches and 66 non-matches
    Purity of oracle classification:  0.892
    Entropy of oracle classification: 0.494
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

133.0
Analysing the file: diverg(20)834_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 834), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)834_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 817 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 131 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
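
The "farthest first selection" reported above can be reproduced with a greedy farthest-first traversal: start from one weight vector, then repeatedly add the vector whose minimum distance to the already-selected vectors is largest. A minimal sketch — the script's actual distance metric and choice of starting vector are not shown in this log, so Euclidean distance and starting from the first vector are assumptions:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    minimum Euclidean distance to the selected set is largest."""
    selected = [vectors[0]]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        # New selection may be closer to some vectors than anything before it
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected

# Toy 2-D demo: the outlier (5, 5) is picked right after the start vector
print(farthest_first([(0, 0), (1, 0), (0.1, 0), (5, 5)], 3))
# → [(0, 0), (5, 5), (1, 0)]
```

This greedy traversal spreads the sample across the cluster, which is why the selected vectors above mix clear matches with a few borderline cases.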

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
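
The purity and entropy figures printed for each oracle classification follow directly from the match / non-match counts: purity is the majority-class fraction of the sample, entropy is the base-2 Shannon entropy of the two-class split. A small sketch that reproduces the numbers above (the function names are illustrative, not taken from the original script):

```python
import math

def purity(num_match, num_non_match):
    # Fraction of the sample belonging to the majority class
    return max(num_match, num_non_match) / (num_match + num_non_match)

def entropy(num_match, num_non_match):
    # Base-2 Shannon entropy of the match / non-match split
    total = num_match + num_non_match
    return -sum(c / total * math.log2(c / total)
                for c in (num_match, num_non_match) if c > 0)

# The oracle above classified 48 matches and 1 non-match:
print(round(purity(48, 1), 3))   # → 0.98
print(round(entropy(48, 1), 3))  # → 0.144
```

The same two quantities appear in each "(size, purity, entropy, estimated match proportion)" queue tuple, e.g. a sample of 24 matches and 64 non-matches gives entropy ≈ 0.8454 as in the Loop 2 listings further down.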

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)403_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 403), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)403_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 24 matches and 64 non-matches
    Purity of oracle classification:  0.727
    Entropy of oracle classification: 0.845
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 24 matches and 64 non-matches
  Classified 95 matches and 853 non-matches
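
The SVM step above splits the remaining cluster by training on the oracle-labelled vectors and predicting the rest. A minimal sketch using scikit-learn's `SVC`; the kernel and parameters of the actual script are not shown in this log, and the training and remaining vectors below are randomly generated stand-ins for the real 7-dimensional weight vectors:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Hypothetical stand-ins for the oracle-labelled vectors: matches cluster
# near 1.0 in every weight, non-matches below 0.6 (assumed, not real data)
train_match = rng.uniform(0.7, 1.0, size=(24, 7))
train_non_match = rng.uniform(0.0, 0.6, size=(64, 7))
X_train = np.vstack([train_match, train_non_match])
y_train = np.array([1] * 24 + [0] * 64)

clf = SVC(kernel="linear")  # kernel choice is an assumption
clf.fit(X_train, y_train)

# Classify the 948 remaining (unlabelled) vectors of the cluster and split
# it into a predicted-match and a predicted-non-match sub-cluster
remaining = rng.uniform(0.0, 1.0, size=(948, 7))
pred = clf.predict(remaining)
match_cluster = remaining[pred == 1]
non_match_cluster = remaining[pred == 0]
print(len(match_cluster), len(non_match_cluster))
```

The two sub-clusters are then pushed back onto the queue with inherited purity/entropy estimates, which is why both Loop 2 queue entries below carry identical statistics.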

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (95, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)
    (853, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)

Current size of match and non-match training data sets: 24 / 64

Selected cluster with (queue ordering: random):
- Purity 0.73 and entropy 0.85
- Size 95 weight vectors
- Estimated match proportion 0.273

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 95 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)203_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (10, 1 - acm diverg, 203), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)203_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1039
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1039 weight vectors
  Containing 220 true matches and 819 true non-matches
    (21.17% true matches)
  Identified 983 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   947  (96.34%)
          2 :    33  (3.36%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 983 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 798

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1038
  Number of unique weight vectors: 983

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (983, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 983 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 983 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 32 matches and 55 non-matches
    Purity of oracle classification:  0.632
    Entropy of oracle classification: 0.949
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 896 weight vectors
  Based on 32 matches and 55 non-matches
  Classified 324 matches and 572 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (324, 0.632183908045977, 0.9489804585630242, 0.367816091954023)
    (572, 0.632183908045977, 0.9489804585630242, 0.367816091954023)

Current size of match and non-match training data sets: 32 / 55

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 324 weight vectors
- Estimated match proportion 0.368

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 324 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 40 matches and 30 non-matches
    Purity of oracle classification:  0.571
    Entropy of oracle classification: 0.985
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  30
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(20)250_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 250), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)250_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 799
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 799 weight vectors
  Containing 224 true matches and 575 true non-matches
    (28.04% true matches)
  Identified 760 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   741  (97.50%)
          2 :    16  (2.11%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 760 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 572

Removed 1 non-pure weight vector

Final number of weight vectors to use: 798
  Number of unique weight vectors: 760

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (760, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 760 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
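
The "farthest first" selection above can be sketched as a greedy farthest-first traversal; a minimal version (the function name and the starting-vector choice are assumptions, and the program's own selection code may differ):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum Euclidean distance to the already selected set is largest."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]             # assumed starting point
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected
```

Each added vector is maximally spread out from those already chosen, which is why the selected sample mixes very high- and very low-similarity weight vectors.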

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
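
The purity and entropy reported for each oracle run follow the standard two-class definitions (purity = majority-class fraction, entropy = two-class Shannon entropy in bits), and the oracle itself can be simulated by flipping each true match status with probability 1 - accuracy. A minimal sketch (helper names are assumptions):

```python
import math
import random

def oracle_classify(true_labels, accuracy, rng=random):
    """Simulate the manual oracle: each true match status is reported
    correctly with probability `accuracy`, otherwise flipped."""
    return [lbl if rng.random() < accuracy else not lbl for lbl in true_labels]

def purity_entropy(labels):
    """Purity is the fraction of the majority class; entropy is the
    two-class Shannon entropy (in bits) of the match proportion."""
    m = sum(labels) / len(labels)       # match proportion
    purity = max(m, 1.0 - m)
    entropy = -sum(p * math.log2(p) for p in (m, 1.0 - m) if p > 0.0)
    return purity, entropy
```

For the 29 matches and 56 non-matches above, this gives purity 56/85 ≈ 0.659 and entropy ≈ 0.926, matching the log.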

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 675 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 149 matches and 526 non-matches
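
After the oracle step, the program trains an SVM on the newly labelled vectors and splits the rest of the cluster by predicted class (149 predicted matches and 526 predicted non-matches above). As a dependency-free sketch of the split mechanics, with a nearest-centroid rule swapped in for the SVM (function names are assumptions):

```python
def centroid(vecs):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vecs)
    return [sum(col) / n for col in zip(*vecs)]

def split_cluster(match_seeds, nonmatch_seeds, cluster_vecs):
    """Stand-in for the SVM split: assign each unlabelled weight vector
    to the nearer of the two seed-class centroids, producing the two
    sub-clusters that are pushed back onto the queue."""
    cm, cn = centroid(match_seeds), centroid(nonmatch_seeds)

    def d2(a, b):                       # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches, non_matches = [], []
    for v in cluster_vecs:
        (matches if d2(v, cm) < d2(v, cn) else non_matches).append(v)
    return matches, non_matches
```

The real classifier boundary differs, but the control flow is the same: the oracle-labelled seeds supervise a split of the remaining unlabelled weight vectors.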

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (526, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 149 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 49 matches and 6 non-matches
    Purity of oracle classification:  0.891
    Entropy of oracle classification: 0.497
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)734_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 734), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)734_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 795
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 795 weight vectors
  Containing 224 true matches and 571 true non-matches
    (28.18% true matches)
  Identified 756 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   737  (97.49%)
          2 :    16  (2.12%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
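
The occurrence distribution above is a double count: first count each unique weight vector's frequency, then count how many unique vectors share each frequency. A minimal sketch:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each unique weight vector to its occurrence count, then count
    how many unique vectors share each occurrence frequency."""
    vec_counts = Counter(map(tuple, weight_vectors))
    dist = Counter(vec_counts.values())  # occurrence -> number of unique vectors
    return len(vec_counts), dict(sorted(dist.items()))
```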

Identified 1 non-pure unique weight vector (from 756 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 568

Removed 1 non-pure weight vector
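
The pureness filter keeps, for each non-pure unique weight vector, only its majority-class copies (the "minority class weight vectors with this pureness to be removed" noted above). A minimal sketch; the tie-breaking rule is an assumption:

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    """Group identical weight vectors, compute each group's pureness
    (match fraction), and drop the minority-class copies so that every
    remaining unique weight vector is pure."""
    groups = defaultdict(list)
    for vec, lbl in zip(map(tuple, weight_vectors), labels):
        groups[vec].append(lbl)

    kept, removed = [], 0
    for vec, lbls in groups.items():
        majority = sum(lbls) * 2 >= len(lbls)  # ties count as matches (assumption)
        for lbl in lbls:
            if lbl == majority:
                kept.append((list(vec), lbl))
            else:
                removed += 1
    return kept, removed
```

This matches the log: one weight vector with pureness 0.950 (one non-match among twenty copies) loses its single minority-class copy.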

Final number of weight vectors to use: 794
  Number of unique weight vectors: 756

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (756, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 756 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 756 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 671 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 147 matches and 524 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (524, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 524 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 524 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 7 matches and 67 non-matches
    Purity of oracle classification:  0.905
    Entropy of oracle classification: 0.452
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)16_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 16), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)16_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1027
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1027 weight vectors
  Containing 223 true matches and 804 true non-matches
    (21.71% true matches)
  Identified 973 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   936  (96.20%)
          2 :    34  (3.49%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 973 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 783

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 973

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (973, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 973 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 973 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 886 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 131 matches and 755 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (755, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 755 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 755 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
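
The farthest-first traversal shown above repeatedly picks the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the weight-vector space. A minimal sketch, assuming Euclidean distance and the first vector as seed (the original script may seed differently, e.g. randomly):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal: start from the first
    vector, then greedily add the vector farthest from the selected set."""
    selected = [vectors[0]]
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(euclidean(v, s) for s in selected),
        )
        selected.append(best)
    return selected
```

The greedy choice guarantees good coverage of the cluster's extremes, which is why so many corner-like vectors (many 0.000 and 1.000 components) appear in the listings.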

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 11 matches and 62 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)184_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 184), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)184_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 847
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 847 weight vectors
  Containing 214 true matches and 633 true non-matches
    (25.27% true matches)
  Identified 793 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   758  (95.59%)
          2 :    32  (4.04%)
          3 :     2  (0.25%)
         19 :     1  (0.13%)
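
The occurrence distribution above first counts how often each identical weight vector appears, then tabulates how many unique vectors occur once, twice, and so on. A sketch using `collections.Counter` (an assumed approach — vectors are converted to tuples so they are hashable):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count duplicate weight vectors, then tabulate how many unique
    vectors occur 1 time, 2 times, etc."""
    vector_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vector_counts.values())

dist = occurrence_distribution(
    [(0.5, 1.0), (0.5, 1.0), (0.2, 0.3), (0.9, 0.1), (0.9, 0.1), (0.9, 0.1)]
)
print(sorted(dist.items()))  # [(1, 1), (2, 1), (3, 1)]
```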

Identified 1 non-pure unique weight vector (from 793 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 612

Removed 1 non-pure weight vector

Final number of weight vectors to use: 846
  Number of unique weight vectors: 793

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (793, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 793 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 708 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 145 matches and 563 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (563, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 145 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)982_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 982), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)982_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 489
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 489 weight vectors
  Containing 192 true matches and 297 true non-matches
    (39.26% true matches)
  Identified 459 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   445  (96.95%)
          2 :    11  (2.40%)
          3 :     2  (0.44%)
         16 :     1  (0.22%)

Identified 1 non-pure unique weight vector (from 459 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 162
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 296

Removed 1 non-pure weight vector

Final number of weight vectors to use: 488
  Number of unique weight vectors: 459

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (459, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 459 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 459 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.786, 0.833, 0.545, 0.478, 0.346] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.364, 0.619, 0.471, 0.600, 0.533] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 27 matches and 52 non-matches
    Purity of oracle classification:  0.658
    Entropy of oracle classification: 0.927
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 380 weight vectors
  Based on 27 matches and 52 non-matches
  Classified 144 matches and 236 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.6582278481012658, 0.9265044456232998, 0.34177215189873417)
    (236, 0.6582278481012658, 0.9265044456232998, 0.34177215189873417)

Current size of match and non-match training data sets: 27 / 52

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 236 weight vectors
- Estimated match proportion 0.342

Sample size for this cluster: 63

Farthest first selection of 63 weight vectors from 236 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.750, 0.905, 0.667, 0.500, 0.571] (False)
    [1.000, 0.000, 0.579, 0.583, 0.522, 0.417, 0.563] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.813, 0.619, 0.333, 0.500, 0.571] (False)
    [1.000, 0.000, 0.500, 0.452, 0.632, 0.714, 0.667] (False)
    [0.680, 0.000, 0.609, 0.737, 0.600, 0.529, 0.696] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.522, 0.786, 0.800, 0.824, 0.667] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [1.000, 0.000, 0.923, 0.667, 0.667, 0.412, 0.571] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.565, 0.857, 0.833, 0.412, 0.667] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.750, 0.875, 0.545, 0.750, 0.571] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.786, 0.857, 0.667, 0.412, 0.857] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.600, 0.700, 0.600, 0.611, 0.706] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 63 weight vectors
  The oracle will correctly classify 63 weight vectors and wrongly classify 0
  Classified 0 matches and 63 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 63 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(20)271_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 271), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)271_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 754
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 754 weight vectors
  Containing 222 true matches and 532 true non-matches
    (29.44% true matches)
  Identified 718 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   699  (97.35%)
          2 :    16  (2.23%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)
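
The occurrence histogram above (how many distinct weight vectors appear once, twice, and so on) can be reproduced with two nested `Counter` passes; the toy vectors below are illustrative, not taken from the actual file:

```python
from collections import Counter

# Each weight vector is a tuple of similarity weights (tuples are hashable).
weight_vectors = [(0.5, 1.0), (0.5, 1.0), (0.3, 0.7),
                  (0.9, 0.2), (0.9, 0.2), (0.9, 0.2)]

vec_counts = Counter(weight_vectors)      # vector -> number of occurrences
freq_dist = Counter(vec_counts.values())  # occurrence count -> number of distinct vectors

num_unique = len(vec_counts)
for occ in sorted(freq_dist):
    n = freq_dist[occ]
    print('%10d : %5d  (%.2f%%)' % (occ, n, 100.0 * n / num_unique))
```

The percentages are taken over the number of unique vectors, which is how the log's distribution sums to 100%.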

Identified 1 non-pure unique weight vector (from 718 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 529

Removed 1 non-pure weight vector

Final number of weight vectors to use: 753
  Number of unique weight vectors: 718
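
A non-pure unique weight vector is one generated by both true matches and true non-matches; the log keeps its majority class and removes the minority-class copies (here plausibly the single non-match among the 17 copies of one vector, giving the 16/17 = 0.941 pureness above). A minimal sketch under that assumption, with hypothetical names:

```python
from collections import defaultdict

def remove_minority_class(labelled_vectors):
    """For each unique weight vector, keep only the copies of its
    majority class (match / non-match) and drop minority-class copies.
    labelled_vectors: list of (vector_tuple, is_match) pairs."""
    by_vec = defaultdict(list)
    for vec, is_match in labelled_vectors:
        by_vec[vec].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        pureness = sum(labels) / len(labels)  # fraction of matches
        majority = pureness >= 0.5            # ties resolved towards match here
        kept += [(vec, lab) for lab in labels if lab == majority]
    return kept
```

With 17 copies of one vector (16 matches, 1 non-match), this drops exactly one weight vector, as in the log.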

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (718, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 718 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 718 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
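
"Farthest first selection" above is the classic farthest-first traversal: seed with one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and an arbitrary seed (the actual script may differ in both):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal (Euclidean distance)."""
    selected = [vectors[0]]                              # arbitrary seed
    # Minimum distance of every vector to the selected set so far.
    min_d = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        min_d = [min(d, math.dist(v, vectors[i]))
                 for v, d in zip(vectors, min_d)]
    return selected
```

Each iteration is O(n) after the O(n) initialisation, so selecting k of n vectors costs O(kn) distance computations.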

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
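
The purity and entropy reported for each oracle call match the standard two-class definitions: purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion. A sketch that reproduces the 28 match / 56 non-match split above:

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity (majority-class fraction) and binary entropy of a cluster."""
    n = num_match + num_non_match
    p = num_match / n                 # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                   # define 0*log(0) = 0
            entropy -= q * math.log(q, 2)
    return purity, entropy

purity, entropy = cluster_stats(28, 56)
print(round(purity, 3), round(entropy, 3))   # 0.667 0.918
```

A pure cluster (e.g. 0 matches, 63 non-matches) gives purity 1.000 and entropy 0.000, as in the earlier oracle call.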

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 634 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 135 matches and 499 non-matches
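
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by predicted class, yielding the two child clusters (135 and 499 vectors here). A sketch using scikit-learn's `SVC` (the linear kernel and default parameters are assumptions; the original program may configure its SVM differently):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining cluster into predicted-match / predicted-non-match parts."""
    clf = SVC(kernel='linear')        # kernel choice is an assumption
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    match_cluster = [v for v, p in zip(cluster_vecs, pred) if p]
    non_match_cluster = [v for v, p in zip(cluster_vecs, pred) if not p]
    return match_cluster, non_match_cluster
```

Both child clusters are then pushed back onto the queue with the parent's estimated purity and entropy, as shown in the next loop.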

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (499, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 499 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 499 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 13 matches and 60 non-matches
    Purity of oracle classification:  0.822
    Entropy of oracle classification: 0.676
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)732_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 732), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)732_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)440_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 440), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)440_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1064
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1064 weight vectors
  Containing 209 true matches and 855 true non-matches
    (19.64% true matches)
  Identified 1017 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.56%)
          2 :    32  (3.15%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1017 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 834

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1063
  Number of unique weight vectors: 1017

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1017, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1017 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1017 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
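The purity and entropy figures reported by the oracle follow directly from the match/non-match counts: purity is the majority-class fraction and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch (the function name is illustrative, not from the original script):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is the binary
    Shannon entropy of the match proportion, in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# Counts from the oracle output above: 30 matches, 57 non-matches
purity, entropy = purity_and_entropy(30, 57)
print('%.3f %.3f' % (purity, entropy))  # 0.655 0.929
```

These reproduce the values printed above (purity 0.655, entropy 0.929), and 30/87 = 0.345 is the estimated match proportion shown for the resulting clusters.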

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 930 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 232 matches and 698 non-matches
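The split step above trains a classifier on the oracle-labelled weight vectors and applies it to the remaining unlabelled ones, producing the two clusters that enter the queue. A sketch of that idea using scikit-learn's `svm.SVC` (an assumption — the original script may configure its SVM differently):

```python
from sklearn import svm

def svm_split(train_vectors, train_labels, unlabelled_vectors):
    """Train an SVM on oracle-labelled weight vectors (label 1 = match,
    0 = non-match), then split the remaining vectors into predicted
    matches and non-matches."""
    clf = svm.SVC(kernel='linear')
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(unlabelled_vectors)
    matches = [v for v, p in zip(unlabelled_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vectors, predictions) if p == 0]
    return matches, non_matches
```

Applied to the run above, the 30/57 labelled vectors split the 930 remaining vectors into 232 predicted matches and 698 predicted non-matches.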

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (232, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (698, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 698 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 698 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)

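Farthest-first selection, as used above, greedily picks the weight vector with the largest minimum distance to the vectors already selected, so the sample spreads out over the cluster. A minimal sketch with Euclidean distance (the original may use a different metric or seeding):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors: start from the first one, then repeatedly
    take the vector whose nearest already-selected vector is farthest away."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

points = [[0.0], [0.1], [1.0], [0.5]]
print(farthest_first(points, 3))  # [[0.0], [1.0], [0.5]]
```

The greedy max-min rule is why the selected vectors listed above include many extreme values (0.000 and 1.000 components).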
Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(20)654_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 654), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)654_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1075
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1075 weight vectors
  Containing 208 true matches and 867 true non-matches
    (19.35% true matches)
  Identified 1028 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   993  (96.60%)
          2 :    32  (3.11%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)
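The frequency distribution above counts how often each exact weight vector occurs in the loaded file. Assuming the vectors are lists of floats, this is two nested counts, sketched here with `collections.Counter`:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """First count how often each exact vector occurs (tuples are hashable,
    lists are not), then count how many vectors occur once, twice, etc."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

vectors = [[0.5, 1.0], [0.5, 1.0], [0.2, 0.3], [1.0, 0.0]]
print(sorted(occurrence_distribution(vectors).items()))  # [(1, 2), (2, 1)]
```

In the run above, 993 of the 1028 unique vectors occur once, and one vector occurs 12 times.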

Identified 1 non-pure unique weight vectors (from 1028 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1074
  Number of unique weight vectors: 1028

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1028, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1028 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1028 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 940 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 123 matches and 817 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 817 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 817 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 11 matches and 60 non-matches
    Purity of oracle classification:  0.845
    Entropy of oracle classification: 0.622
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(15)694_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 694), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)694_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1026
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1026 weight vectors
  Containing 198 true matches and 828 true non-matches
    (19.30% true matches)
  Identified 984 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   949  (96.44%)
          2 :    32  (3.25%)
          3 :     2  (0.20%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 984 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 808

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 984

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (984, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 984 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 984 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 897 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 93 matches and 804 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (93, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (804, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 804 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 804 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
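Farthest-first selection, as used above, picks a maximally spread sample: starting from a seed vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest. A minimal sketch (the seed choice and distance metric are assumptions; the log does not show them):

```python
def farthest_first(vectors, k):
    # Greedy farthest-first traversal over a list of weight-vector tuples.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]  # assumed seed: the first vector
    candidates = list(vectors[1:])
    while len(selected) < k and candidates:
        # Pick the candidate farthest from its nearest selected vector
        best = max(candidates,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        candidates.remove(best)
    return selected

print(farthest_first([(0.0,), (0.4,), (1.0,)], 2))  # [(0.0,), (1.0,)]
```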

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
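The oracle accuracy parameter models human labelling error: with accuracy a, each queried true match status is flipped with probability 1 - a, so at 100.00% accuracy (as in these runs) zero weight vectors are wrongly classified. A hypothetical sketch of such an oracle:

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    # Each true match status is returned unchanged with probability
    # `accuracy` and flipped otherwise (hypothetical helper, not the
    # program's actual oracle function).
    rng = rng or random.Random()
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]

labels = [True, False, False, True]
# At accuracy 1.0 the oracle is perfect: no labels are flipped
assert oracle_classify(labels, 1.0) == labels
```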

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(20)305_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 305), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)305_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
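The occurrence histogram above counts how often each distinct weight vector appears (here 1007 vectors occur once, 34 occur twice, and so on, giving 1044 unique vectors out of 1101). A minimal sketch of this analysis step with toy data:

```python
from collections import Counter

# Toy weight vectors (tuples, so they are hashable)
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
           (0.9, 0.9), (0.9, 0.9), (0.9, 0.9)]

vec_counts = Counter(vectors)              # vector -> occurrence count
histogram = Counter(vec_counts.values())   # count -> number of vectors

print(len(vec_counts), dict(histogram))  # 3 {2: 1, 1: 1, 3: 1}
```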

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
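Pureness of a unique weight vector is the fraction of its occurrences that were generated by true matching pairs; a vector with 0 < pureness < 1 is non-pure, and its minority-class copies are removed. A sketch of the check; the 0.950 figure is consistent with the 20-fold vector above holding 19 match and 1 non-match occurrence, though the log does not state that pairing explicitly:

```python
def pureness(match_count, non_match_count):
    # Fraction of a unique weight vector's occurrences that are matches
    return match_count / (match_count + non_match_count)

def is_non_pure(match_count, non_match_count):
    # Non-pure: the same vector occurs as both a match and a non-match
    p = pureness(match_count, non_match_count)
    return 0.0 < p < 1.0

# 19 match + 1 non-match occurrences -> pureness 0.950; the single
# minority-class (non-match) copy would be removed
assert round(pureness(19, 1), 3) == 0.95
assert is_non_pure(19, 1) and not is_non_pure(20, 0)
```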

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)684_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 684), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)684_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 256
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 256 weight vectors
  Containing 209 true matches and 47 true non-matches
    (81.64% true matches)
  Identified 225 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   210  (93.33%)
          2 :    12  (5.33%)
          3 :     2  (0.89%)
         16 :     1  (0.44%)

Identified 1 non-pure unique weight vector (from 225 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 178
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 46

Removed 1 non-pure weight vector

Final number of weight vectors to use: 255
  Number of unique weight vectors: 225

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (225, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 225 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 67

Perform initial selection using "far" method

Farthest first selection of 67 weight vectors from 225 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 40 matches and 27 non-matches
    Purity of oracle classification:  0.597
    Entropy of oracle classification: 0.973
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  27
    Number of false non-matches: 0

Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 158 weight vectors
  Based on 40 matches and 27 non-matches
  Classified 158 matches and 0 non-matches

43.0
Analysing file: diverg(20)35_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 35), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)35_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1064
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1064 weight vectors
  Containing 209 true matches and 855 true non-matches
    (19.64% true matches)
  Identified 1017 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.56%)
          2 :    32  (3.15%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1017 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 834

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1063
  Number of unique weight vectors: 1017

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1017, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1017 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1017 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
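
The purity and entropy figures reported by the oracle are consistent with the standard binary definitions: purity is the fraction of the majority class, and entropy is the base-2 Shannon entropy of the match proportion. A minimal sketch (function names are illustrative, not from the original program) that reproduces the numbers above:

```python
import math

def purity(num_matches, num_non_matches):
    """Fraction of the majority class among the classified vectors."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    """Base-2 Shannon entropy of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Reproduces the figures for 27 matches / 60 non-matches above:
print(round(purity(27, 60), 3))   # 0.69
print(round(entropy(27, 60), 3))  # 0.894
```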

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 930 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 139 matches and 791 non-matches
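
After the oracle labels a sample, the remaining unlabelled vectors of the cluster are classified and the cluster is split into two child clusters, one per predicted class. The program trains an SVM on the oracle-labelled sample for this step; in the dependency-free sketch below a nearest-centroid rule stands in for the SVM (the split logic, not the classifier, is the point):

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length weight vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def split_cluster(remaining, match_train, non_match_train):
    """Partition the unlabelled vectors of a cluster into two child
    clusters according to a classifier trained on the oracle-labelled
    sample.  Nearest-centroid is used here as a simple stand-in for
    the SVM the original program trains."""
    m_c = centroid(match_train)
    n_c = centroid(non_match_train)
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    matches, non_matches = [], []
    for v in remaining:
        (matches if dist(v, m_c) <= dist(v, n_c) else non_matches).append(v)
    return matches, non_matches
```

The two child clusters are then pushed onto the processing queue, inheriting the purity and entropy estimated from the labelled sample (as the queue listing in the next loop shows).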

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (139, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (791, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 139 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 139 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
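
Farthest-first selection greedily picks the weight vector whose distance to its nearest already-selected vector is largest, so the sample spreads across the whole cluster. A minimal sketch (the seeding rule and the Euclidean metric are assumptions; the original program may differ):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector,
    then repeatedly add the vector whose distance to its closest
    already-selected vector is largest."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    selected = [vectors[0]]  # seeding with the first vector is an assumption
    while len(selected) < min(k, len(vectors)):
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected

# e.g. farthest_first([[0.0], [0.1], [0.5], [1.0]], 2) picks the two extremes
```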

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(20)295_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 295), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)295_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority-class weight vectors with this pureness will be removed)
     0.000 : 853
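
The pureness analysis groups identical weight vectors and computes, for each unique vector, the proportion of its copies that are true matches; the minority-class copies of any non-pure vector are then dropped. A sketch under assumptions (input as `(weight_tuple, is_match)` pairs; ties resolved toward non-match — both are illustrative choices, not taken from the program):

```python
from collections import defaultdict

def remove_non_pure(labelled_vectors):
    """labelled_vectors: list of (weight_tuple, is_match) pairs.
    For each unique weight tuple, compute pureness = fraction of its
    copies that are matches, and drop the minority-class copies of any
    tuple that is neither fully pure (1.0) nor fully impure (0.0)."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        if 0.0 < pureness < 1.0:
            majority = pureness > 0.5  # ties go to non-match (arbitrary here)
            labels = [l for l in labels if l == majority]
        kept.extend((vec, l) for l in labels)
    return kept
```

This matches the run above: the vector occurring 20 times with pureness 0.950 (19 matches, 1 non-match) loses its single non-match copy, leaving 1100 of the 1101 weight vectors.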

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using the "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)976_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 976), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)976_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 770
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 770 weight vectors
  Containing 207 true matches and 563 true non-matches
    (26.88% true matches)
  Identified 741 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   724  (97.71%)
          2 :    14  (1.89%)
          3 :     2  (0.27%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 741 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.917 :  1   (minority-class weight vectors with this pureness will be removed)
     0.000 : 560

Removed 1 non-pure weight vector

Final number of weight vectors to use: 769
  Number of unique weight vectors: 741

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (741, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 741 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using the "far" method

Farthest first selection of 85 weight vectors from 741 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 35 matches and 50 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.977
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 656 weight vectors
  Based on 35 matches and 50 non-matches
  Classified 152 matches and 504 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.5882352941176471, 0.9774178175281716, 0.4117647058823529)
    (504, 0.5882352941176471, 0.9774178175281716, 0.4117647058823529)

Current size of match and non-match training data sets: 35 / 50

Selected cluster (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 152 weight vectors
- Estimated match proportion 0.412

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 152 vectors
  The selected farthest weight vectors are:
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 49 matches and 9 non-matches
    Purity of oracle classification:  0.845
    Entropy of oracle classification: 0.623
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0
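The purity and entropy figures reported by the oracle step follow the usual binary-cluster definitions: purity is the majority-class fraction of the sample, and entropy is the Shannon entropy (in bits) of the match/non-match split. A minimal sketch (function name is illustrative, not from the original script) that reproduces the 0.845 / 0.623 values for the 49-match, 9-non-match sample above:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy in bits over the
    match / non-match split (0.0 for a pure cluster, 1.0 for a 50/50 one)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

The same function also reproduces the (0.701, 0.880) pair reported for the 26-match, 61-non-match sample later in this log.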

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(20)640_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 640), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)640_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1043
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1043 weight vectors
  Containing 222 true matches and 821 true non-matches
    (21.28% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   952  (96.26%)
          2 :    34  (3.44%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 800

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1042
  Number of unique weight vectors: 989

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
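"Farthest first selection" normally refers to the greedy farthest-first traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the weight-vector space. A minimal Euclidean-distance sketch assuming that standard formulation (the original script's seeding rule and distance metric are not shown in this log):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of equal-length
    numeric tuples; returns k selected vectors."""
    selected = [vectors[0]]  # assumed seed: the first vector
    # Minimum distance from each candidate to the selected set so far.
    dists = [math.dist(v, selected[0]) for v in vectors]
    for _ in range(k - 1):
        idx = max(range(len(vectors)), key=lambda i: dists[i])
        selected.append(vectors[idx])
        # A selected vector has distance 0 to itself, so it is never re-picked.
        dists = [min(d, math.dist(v, vectors[idx]))
                 for d, v in zip(dists, vectors)]
    return selected
```

Each round costs O(n) distance updates, so selecting k of n vectors is O(nk), which matches the per-cluster sample sizes seen in this log.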

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 145 matches and 757 non-matches
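The split step trains a classifier on the 26 + 61 oracle-labelled vectors and partitions the remaining 902 cluster vectors by its predictions into candidate match and non-match child clusters. A minimal sketch assuming scikit-learn's `SVC` with default settings (the log does not state the kernel or parameters actually used, so treat this as illustrative):

```python
# Assumption: scikit-learn SVC stands in for the script's SVM classifier.
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, remaining_vectors):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining weight vectors into predicted matches / non-matches."""
    clf = SVC()  # default RBF kernel; the original parameters are unknown
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, preds) if p]
    non_matches = [v for v, p in zip(remaining_vectors, preds) if not p]
    return matches, non_matches
```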

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (757, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 145 weight vectors
- Estimated match proportion 0.299
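The estimated match proportion 0.299 that both child clusters inherit appears to be the match fraction of the parent's oracle-classified sample, 26 matches out of 87; the exact estimator is not shown in this log excerpt, so this is an assumption checked against the printed value:

```python
# Assumed estimator: match fraction of the oracle-labelled sample.
oracle_matches, oracle_non_matches = 26, 61
est_match_proportion = oracle_matches / (oracle_matches + oracle_non_matches)
# 26/87 = 0.29885..., matching the 0.2988505747... shown in the queue listing
```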

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(20)381_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 381), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)381_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 148 matches and 784 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (784, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 784 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 784 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 8 matches and 66 non-matches
    Purity of oracle classification:  0.892
    Entropy of oracle classification: 0.494
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)383_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 383), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)383_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 797
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 797 weight vectors
  Containing 225 true matches and 572 true non-matches
    (28.23% true matches)
  Identified 740 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   703  (95.00%)
          2 :    34  (4.59%)
          3 :     2  (0.27%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 740 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 551

Removed 1 non-pure weight vector

Final number of weight vectors to use: 796
  Number of unique weight vectors: 740

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (740, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 740 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 740 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
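
Farthest first selection, as used above, greedily picks the next weight vector whose minimum distance to the already-selected set is largest, so the sample spreads over the whole cluster. A sketch assuming Euclidean distance and an arbitrary starting vector (both assumptions; the script's distance measure and starting choice may differ):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedily select k vectors, each maximising its minimum Euclidean
    distance to the vectors already selected (farthest-first traversal)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[nxt]))
    return [vectors[i] for i in selected]
```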

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
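
The oracle simulates a human reviewer with a configurable accuracy: each queried weight vector's true match status is returned correctly with probability `oracle_acc` and flipped otherwise, so with 100% accuracy (as here) no false matches or false non-matches occur. A sketch with hypothetical names:

```python
import random

def oracle_classify(true_labels, oracle_acc, rng=None):
    """Return each true label correctly with probability oracle_acc and
    flipped otherwise, simulating an imperfect human oracle."""
    rng = rng or random.Random()
    # random() is in [0, 1), so oracle_acc = 1.0 never flips a label
    return [lab if rng.random() < oracle_acc else not lab
            for lab in true_labels]
```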

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 655 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 329 matches and 326 non-matches
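
When a sampled cluster is still impure or too large, its remaining unlabelled vectors are split with a classifier trained on everything the oracle has labelled so far, and both children are pushed back onto the queue. A plausible reconstruction using scikit-learn's `SVC` (an assumption; the original program may bind to a different SVM implementation):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    """Split the unlabelled remainder of a cluster into predicted-match
    and predicted-non-match children using a linear SVM."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)       # oracle-labelled samples
    preds = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, preds) if p == 0]
    return matches, non_matches
```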

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (329, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (326, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 329 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 329 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 42 matches and 26 non-matches
    Purity of oracle classification:  0.618
    Entropy of oracle classification: 0.960
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
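
The run above illustrates the overall control flow: clusters wait in a queue; each loop pops one, draws a sample, has the oracle label it, grows the match and non-match training sets, and, if the cluster is still impure or too large, splits the remainder and enqueues the children, stopping once the manual-classification budget is spent. A much-simplified sketch of that loop (random sampling stands in for farthest-first selection, and a naive half-split for the SVM):

```python
import random

def recursive_select(clusters, is_match, budget, sample_size=10,
                     min_purity=0.95, rng=None):
    """Skeleton of the recursive selection loop. `is_match` plays the
    role of a perfect oracle; the real selection and split steps are
    replaced by trivial stand-ins."""
    rng = rng or random.Random(42)
    queue = list(clusters)
    train_m, train_n = [], []        # match / non-match training sets
    used = 0                         # manual classifications performed
    while queue and used + sample_size <= budget:
        cluster = queue.pop(rng.randrange(len(queue)))
        sample = rng.sample(cluster, min(sample_size, len(cluster)))
        for vec in sample:           # ask the oracle
            (train_m if is_match(vec) else train_n).append(vec)
        used += len(sample)
        # Remove classified vectors (drops duplicates too; fine here)
        rest = [v for v in cluster if v not in sample]
        sample_m = sum(1 for v in sample if is_match(v))
        purity = max(sample_m, len(sample) - sample_m) / len(sample)
        if rest and purity < min_purity:
            mid = len(rest) // 2     # naive stand-in for the SVM split
            queue.extend([rest[:mid], rest[mid:]])
    return train_m, train_n, used
```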

39.0
Analysing file: diverg(10)607_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 607), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)607_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 476
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 476 weight vectors
  Containing 212 true matches and 264 true non-matches
    (44.54% true matches)
  Identified 442 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   425  (96.15%)
          2 :    14  (3.17%)
          3 :     2  (0.45%)
         17 :     1  (0.23%)

Identified 1 non-pure unique weight vector (from 442 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 261

Removed 1 non-pure weight vector

Final number of weight vectors to use: 475
  Number of unique weight vectors: 442
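
The pureness step above groups identical weight vectors and measures what fraction of each group's occurrences are true matches; a group that mixes matches and non-matches (here one vector with pureness 0.941) keeps only its majority-class occurrences. A sketch of that filter (hypothetical helper name):

```python
from collections import defaultdict

def remove_non_pure(vectors, labels):
    """Group identical weight vectors; in groups that mix matches and
    non-matches, keep only the majority-class occurrences."""
    groups = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        groups[tuple(vec)].append(lab)
    kept_vecs, kept_labels = [], []
    for vec, lab in zip(vectors, labels):
        labs = groups[tuple(vec)]
        pureness = sum(labs) / len(labs)   # fraction of true matches
        # Keep pure groups entirely; otherwise keep majority class only
        if pureness in (0.0, 1.0) or lab == (pureness >= 0.5):
            kept_vecs.append(vec)
            kept_labels.append(lab)
    return kept_vecs, kept_labels
```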

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (442, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 442 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 442 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 36 matches and 43 non-matches
    Purity of oracle classification:  0.544
    Entropy of oracle classification: 0.994
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 363 weight vectors
  Based on 36 matches and 43 non-matches
  Classified 140 matches and 223 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.5443037974683544, 0.9943290455933882, 0.45569620253164556)
    (223, 0.5443037974683544, 0.9943290455933882, 0.45569620253164556)

Current size of match and non-match training data sets: 36 / 43

Selected cluster with (queue ordering: random):
- Purity 0.54 and entropy 0.99
- Size 223 weight vectors
- Estimated match proportion 0.456

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 223 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 5 matches and 62 non-matches
    Purity of oracle classification:  0.925
    Entropy of oracle classification: 0.383
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)618_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 618), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)618_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 927
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 927 weight vectors
  Containing 218 true matches and 709 true non-matches
    (23.52% true matches)
  Identified 872 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   836  (95.87%)
          2 :    33  (3.78%)
          3 :     2  (0.23%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 872 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 688

Removed 1 non-pure weight vector

Final number of weight vectors to use: 926
  Number of unique weight vectors: 872

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (872, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 872 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 872 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 786 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 157 matches and 629 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (157, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (629, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 629 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 629 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 3 matches and 71 non-matches
    Purity of oracle classification:  0.959
    Entropy of oracle classification: 0.245
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)150_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 150), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)150_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 393
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 393 weight vectors
  Containing 213 true matches and 180 true non-matches
    (54.20% true matches)
  Identified 356 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   338  (94.94%)
          2 :    15  (4.21%)
          3 :     2  (0.56%)
         19 :     1  (0.28%)

Identified 1 non-pure unique weight vector (from 356 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 178
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 177

Removed 1 non-pure weight vector

Final number of weight vectors to use: 392
  Number of unique weight vectors: 356
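The pureness histogram above counts, for each unique weight vector, the proportion of its occurrences that are true matches; a vector whose occurrences mix both classes is non-pure, and its minority-class copies are removed. A sketch of that bookkeeping (assuming each row is a `(vector, is_match)` pair; names are mine):

```python
from collections import defaultdict

def pureness_table(rows):
    """Map each unique weight vector to (occurrences, match proportion)."""
    counts = defaultdict(lambda: [0, 0])     # vector -> [total, matches]
    for vec, is_match in rows:
        counts[tuple(vec)][0] += 1
        counts[tuple(vec)][1] += int(is_match)
    return {v: (tot, m / tot) for v, (tot, m) in counts.items()}

# One vector occurring 19 times with a single non-match occurrence
# would give pureness 18/19 ~ 0.947, consistent with the histogram above.
rows = ([([0.5, 0.5], True)] * 18 + [([0.5, 0.5], False)]
        + [([0.1, 0.9], False)])
table = pureness_table(rows)
print(table[(0.5, 0.5)])   # (19, 0.947...)
```

Only vectors with pureness exactly 1.000 or 0.000 are kept as-is; the single 0.947 vector loses its one minority-class occurrence, which matches "Removed 1 non-pure weight vector" above.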

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (356, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 356 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 356 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 31 matches and 45 non-matches
    Purity of oracle classification:  0.592
    Entropy of oracle classification: 0.975
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 280 weight vectors
  Based on 31 matches and 45 non-matches
  Classified 152 matches and 128 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.5921052631578947, 0.9753817903274212, 0.40789473684210525)
    (128, 0.5921052631578947, 0.9753817903274212, 0.40789473684210525)

Current size of match and non-match training data sets: 31 / 45

Selected cluster (queue ordering: random) with:
- Purity 0.59 and entropy 0.98
- Size 152 weight vectors
- Estimated match proportion 0.408

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 152 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 51 matches and 7 non-matches
    Purity of oracle classification:  0.879
    Entropy of oracle classification: 0.531
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)204_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (20, 1 - acm diverg, 204), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)204_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 963
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 963 weight vectors
  Containing 212 true matches and 751 true non-matches
    (22.01% true matches)
  Identified 910 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   875  (96.15%)
          2 :    32  (3.52%)
          3 :     2  (0.22%)
         18 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 910 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 179
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 730

Removed 1 non-pure weight vector

Final number of weight vectors to use: 962
  Number of unique weight vectors: 910

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (910, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 910 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 910 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 823 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 0 matches and 823 non-matches

48.0
Analysing file: diverg(15)955_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (15, 1 - acm diverg, 955), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)955_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 894
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 894 weight vectors
  Containing 190 true matches and 704 true non-matches
    (21.25% true matches)
  Identified 854 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   820  (96.02%)
          2 :    31  (3.63%)
          3 :     2  (0.23%)
          6 :     1  (0.12%)

Identified 0 non-pure unique weight vectors (from 854 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 170
     0.000 : 684

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 894
  Number of unique weight vectors: 854

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (854, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 854 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 854 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 27 matches and 59 non-matches
    Purity of oracle classification:  0.686
    Entropy of oracle classification: 0.898
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
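The purity and entropy figures logged above can be reproduced directly from the oracle's class counts: purity is the majority-class fraction of the cluster, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (function name `purity_entropy` is illustrative, not from the program):

```python
# Purity = majority-class fraction; entropy = binary Shannon entropy
# of the match proportion within a cluster.
import math

def purity_entropy(num_matches, num_non_matches):
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)         # fraction of the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The oracle above classified 27 matches and 59 non-matches:
purity, entropy = purity_entropy(27, 59)
print(round(purity, 3), round(entropy, 3))  # 0.686 0.898
```

These match the logged values 0.686 and 0.898 for the 86 classified weight vectors.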

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 768 weight vectors
  Based on 27 matches and 59 non-matches
  Classified 117 matches and 651 non-matches
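The splitting step can be sketched as: train an SVM on the oracle-labelled samples, then partition the remaining cluster by predicted class. This is a hedged sketch using scikit-learn; the actual program's SVM kernel and parameters are unknown, so a linear kernel is assumed here:

```python
# Sketch: split a cluster of weight vectors into a predicted-match and
# a predicted-non-match sub-cluster using an SVM trained on the
# oracle-classified samples. Kernel choice is an assumption.
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    clf = SVC(kernel='linear')       # assumed kernel
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    return cluster_vecs[pred == 1], cluster_vecs[pred == 0]

# Toy usage: two well-separated groups of 7-dimensional weight vectors
rng = np.random.default_rng(0)
train = np.vstack([rng.uniform(0.7, 1.0, (10, 7)),   # match-like
                   rng.uniform(0.0, 0.3, (10, 7))])  # non-match-like
labels = np.array([1] * 10 + [0] * 10)
rest = np.vstack([rng.uniform(0.7, 1.0, (5, 7)),
                  rng.uniform(0.0, 0.3, (5, 7))])
m, n = svm_split(train, labels, rest)
print(len(m), len(n))  # 5 5
```

In the run above, the 768 unclassified vectors were split this way into sub-clusters of 117 and 651, which are then pushed back onto the queue.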

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (117, 0.686046511627907, 0.8976844934141643, 0.313953488372093)
    (651, 0.686046511627907, 0.8976844934141643, 0.313953488372093)

Current size of match and non-match training data sets: 27 / 59

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.90
- Size 651 weight vectors
- Estimated match proportion 0.314

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 651 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
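The "farthest first" selection above follows the standard farthest-first traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A sketch (seed choice and Euclidean metric are assumptions; the program's exact choices may differ):

```python
# Farthest-first traversal: greedily pick the point that maximises the
# minimum distance to the set selected so far.
import math

def farthest_first(vectors, k):
    selected = [vectors[0]]                  # assumed seed: first vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected

# Toy usage on 1-D "weight vectors":
pts = [(0.0,), (0.1,), (0.5,), (0.9,), (1.0,)]
print(farthest_first(pts, 3))  # [(0.0,), (1.0,), (0.5,)]
```

This bias toward mutually distant vectors is why the selected samples above spread across both match-like and non-match-like regions of the weight space.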

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 11 matches and 62 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0
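The oracle itself (cf. the `oracle_acc` command-line parameter in the usage header) can be simulated as a labeller that returns the true match status with probability `acc` and the flipped status otherwise. A minimal sketch:

```python
# Sketch of an imperfect oracle: correct with probability `acc`,
# wrong otherwise. With acc=1.0 it is always correct, as logged above.
import random

def oracle(true_status, acc, rng=random):
    if rng.random() < acc:
        return true_status
    return not true_status

# acc=1.0: always the true label; acc=0.0: always flipped
assert oracle(True, 1.0) is True
assert oracle(False, 0.0) is True
```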

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analyzing file: diverg(20)105_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 105), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)105_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
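The occurrence distribution can be derived by counting how often each exact weight vector appears, then counting those counts. A sketch using tuples as hashable keys (function name is illustrative):

```python
# Occurrence distribution: how many distinct weight vectors appear
# exactly 1 time, 2 times, etc.
from collections import Counter

def occurrence_distribution(weight_vectors):
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())   # occurrence -> vector count

vecs = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3],
        [0.9, 0.9], [0.9, 0.9], [0.9, 0.9]]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```

The logged distribution is internally consistent: 990*1 + 34*2 + 2*3 + 1*20 = 1084 vectors over 1027 unique ones.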

Identified 1 non-pure unique weight vectors (from 1027 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836
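Here pureness of a unique weight vector is the fraction of its occurrences labelled as true matches; for non-pure vectors, the minority-class copies are removed. This interpretation is an assumption, but it is consistent with the run: the one vector with pureness 0.950 (19 of 20 copies are matches) loses its single non-match copy, taking 1084 vectors down to the final 1083. A sketch:

```python
# Sketch (assumed interpretation): group (vector, is_match) pairs by
# vector, compute pureness, and drop minority-class copies of any
# vector that is neither all-match nor all-non-match.
from collections import defaultdict

def remove_minority_copies(pairs):
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        if pureness in (0.0, 1.0):               # pure: keep all copies
            kept.extend((vec, l) for l in labels)
        else:                                    # non-pure: majority only
            majority = pureness >= 0.5
            kept.extend((vec, l) for l in labels if l == majority)
    return kept

# One vector occurring 20 times: 19 matches, 1 non-match (pureness 0.95)
data = [((0.9, 0.9), True)] * 19 + [((0.9, 0.9), False)]
print(len(remove_minority_copies(data)))  # 19
```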

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 179 matches and 760 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (760, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 760 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)984_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.977444
recall                 0.434783
f-measure              0.601852
da                          133
dm                            0
ndm                           0
tp                          130
fp                            3
tn                  4.76529e+07
fn                          169
Name: (10, 1 - acm diverg, 984), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)984_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 256
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 256 weight vectors
  Containing 125 true matches and 131 true non-matches
    (48.83% true matches)
  Identified 243 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   233  (95.88%)
          2 :     7  (2.88%)
          3 :     3  (1.23%)

Identified 0 non-pure unique weight vectors (from 243 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 114
     0.000 : 129

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 256
  Number of unique weight vectors: 243

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (243, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 243 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 69

Perform initial selection using "far" method

Farthest first selection of 69 weight vectors from 243 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 30 matches and 39 non-matches
    Purity of oracle classification:  0.565
    Entropy of oracle classification: 0.988
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  39
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 174 weight vectors
  Based on 30 matches and 39 non-matches
  Classified 85 matches and 89 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 69
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (85, 0.5652173913043478, 0.9876925088958034, 0.43478260869565216)
    (89, 0.5652173913043478, 0.9876925088958034, 0.43478260869565216)

Current size of match and non-match training data sets: 30 / 39

Selected cluster with (queue ordering: random):
- Purity 0.57 and entropy 0.99
- Size 89 weight vectors
- Estimated match proportion 0.435

Sample size for this cluster: 46

Farthest first selection of 46 weight vectors from 89 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
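Farthest-first selection greedily picks, at each step, the vector whose minimum distance to the already-selected vectors is largest, spreading the sample across the cluster. A dependency-free sketch (Euclidean distance and seeding from the first vector are assumptions; the real program may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector
    maximising the minimum distance to the selected set."""
    selected = [vectors[0]]            # seed choice is an assumption
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        def min_dist(v):
            return min(math.dist(v, s) for s in selected)
        far = max(remaining, key=min_dist)
        remaining.remove(far)
        selected.append(far)
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.1), (0.9, 0.2), (0.5, 0.5)]
print(farthest_first(vecs, 3))
```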

Perform oracle with 100.00% accuracy on 46 weight vectors
  The oracle will correctly classify 46 weight vectors and misclassify 0
  Classified 2 matches and 44 non-matches
    Purity of oracle classification:  0.957
    Entropy of oracle classification: 0.258
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0

Deleted 46 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

133.0
Analysing the file: diverg(20)344_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (20, 1 - acm diverg, 344), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)344_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 848
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 848 weight vectors
  Containing 214 true matches and 634 true non-matches
    (25.24% true matches)
  Identified 794 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   759  (95.59%)
          2 :    32  (4.03%)
          3 :     2  (0.25%)
         19 :     1  (0.13%)
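The occurrence histogram above is two counting passes: first how often each unique weight vector occurs, then how many unique vectors share each occurrence count. A sketch with toy data (not the vectors from this log):

```python
from collections import Counter

vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
           (0.9, 0.9), (0.2, 0.3), (0.2, 0.3)]
per_vector = Counter(vectors)                 # vector -> occurrences
freq_of_freq = Counter(per_vector.values())   # occurrences -> number of vectors

for occ, n in sorted(freq_of_freq.items()):
    pct = 100.0 * n / len(per_vector)         # percentage of unique vectors
    print(f'{occ:>4} : {n:>4}  ({pct:.2f}%)')
```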

Identified 1 non-pure unique weight vector (from 794 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority-class weight vectors with this pureness to be removed)
     0.000 : 613

Removed 1 non-pure weight vector
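A unique weight vector is non-pure when it occurs with both match and non-match labels; the clean-up removes its minority-class copies so every surviving unique vector is pure. A toy sketch mirroring the 18/19 = 0.947 pureness case above (the tie-breaking rule is an assumption):

```python
from collections import defaultdict

# (vector, is_match) pairs: one vector seen 19 times with a single
# dissenting label (pureness 18/19 = 0.947), one pure non-match vector.
data = [((1.0, 1.0), True)] * 18 + [((1.0, 1.0), False)] \
     + [((0.1, 0.2), False)] * 3

by_vec = defaultdict(list)
for vec, label in data:
    by_vec[vec].append(label)

cleaned = []
for vec, labels in by_vec.items():
    majority = sum(labels) * 2 >= len(labels)   # tie goes to match (assumption)
    cleaned += [(vec, lab) for lab in labels if lab == majority]

print(len(data) - len(cleaned))   # 1 minority-class copy removed
```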

Final number of weight vectors to use: 847
  Number of unique weight vectors: 794

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (794, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 794 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 794 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and misclassify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 709 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 145 matches and 564 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (564, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 145 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and misclassify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(20)189_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 189), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)189_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority-class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and misclassify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and misclassify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)315_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (10, 1 - acm diverg, 315), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)315_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 346
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 346 weight vectors
  Containing 164 true matches and 182 true non-matches
    (47.40% true matches)
  Identified 330 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   320  (96.97%)
          2 :     7  (2.12%)
          3 :     2  (0.61%)
          6 :     1  (0.30%)

Identified 0 non-pure unique weight vectors (from 330 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 148
     0.000 : 182

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 346
  Number of unique weight vectors: 330

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (330, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 330 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 330 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
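The "far" initial selection logged above is a farthest-first traversal: start from one vector, then greedily add the vector whose minimum distance to the already selected set is largest. A minimal sketch, assuming Euclidean distance and a random starting vector (the script's exact distance metric and seeding may differ):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedily select k vectors so that each new pick maximises its
    minimum Euclidean distance to the vectors already selected."""
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    selected = [int(rng.integers(len(vectors)))]
    # distance of every vector to its nearest already-selected vector
    min_dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```

This favours vectors spread across the whole cluster, which is why the selected sample mixes clear matches and clear non-matches.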

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 39 matches and 35 non-matches
    Purity of oracle classification:  0.527
    Entropy of oracle classification: 0.998
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  35
    Number of false non-matches: 0
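The purity and entropy figures reported for each oracle classification follow from the binary match/non-match counts; a sketch of the presumed formulas (majority-class fraction, and Shannon entropy in bits):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy of the binary
    match/non-match split in bits (0 = pure, 1 = 50/50)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy
```

For example, 39 matches and 35 non-matches give purity 39/74 ≈ 0.527 and entropy ≈ 0.998, matching the values logged above.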

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 256 weight vectors
  Based on 39 matches and 35 non-matches
  Classified 115 matches and 141 non-matches
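After the oracle-labelled sample is removed, the cluster is split by training an SVM on the labelled vectors and predicting the rest, yielding one predicted-match and one predicted-non-match sub-cluster. A sketch using scikit-learn's SVC as a stand-in (the script's actual SVM implementation and kernel are not shown in this log):

```python
import numpy as np
from sklearn.svm import SVC  # assumption: stand-in for the script's SVM library

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Fit an SVM on oracle-labelled weight vectors (1 = match, 0 = non-match)
    and split the remaining cluster by the predicted class."""
    clf = SVC(kernel="linear")
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches
```

Both sub-clusters inherit the parent's purity/entropy estimate until they are themselves sampled, which is why the two queue entries in the next loop share identical statistics.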

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 74
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (115, 0.527027027027027, 0.9978913098356863, 0.527027027027027)
    (141, 0.527027027027027, 0.9978913098356863, 0.527027027027027)

Current size of match and non-match training data sets: 39 / 35

Selected cluster with (queue ordering: random):
- Purity 0.53 and entropy 1.00
- Size 115 weight vectors
- Estimated match proportion 0.527

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 115 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 42 matches and 10 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.706
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  10
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing file: diverg(20)626_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 626), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)626_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 799
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 799 weight vectors
  Containing 224 true matches and 575 true non-matches
    (28.04% true matches)
  Identified 760 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   741  (97.50%)
          2 :    16  (2.11%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 760 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 572

Removed 1 non-pure weight vector

Final number of weight vectors to use: 798
  Number of unique weight vectors: 760

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (760, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 760 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 675 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 149 matches and 526 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (526, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 526 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 526 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.433, 0.667, 0.500, 0.636, 0.421] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 4 matches and 71 non-matches
    Purity of oracle classification:  0.947
    Entropy of oracle classification: 0.300
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)48_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 48), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)48_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 689
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 689 weight vectors
  Containing 219 true matches and 470 true non-matches
    (31.79% true matches)
  Identified 656 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   640  (97.56%)
          2 :    13  (1.98%)
          3 :     2  (0.30%)
         17 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 656 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 469

Removed 1 non-pure weight vector

Final number of weight vectors to use: 688
  Number of unique weight vectors: 656

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (656, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 656 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 656 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 572 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 145 matches and 427 non-matches
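
The splitting step trains a classifier on the oracle-labelled sample and uses it to divide the cluster's remaining weight vectors into a predicted-match and a predicted-non-match child. A sketch of that step, assuming scikit-learn's `svm.SVC` as the SVM implementation (the script's actual classifier setup is not shown in this log):

```python
from sklearn import svm

def split_cluster(labelled_vectors, labels, remaining_vectors):
    # Train an SVM on the oracle-labelled sample (1 = match, 0 = non-match),
    # then split the unlabelled remainder of the cluster by its predictions.
    clf = svm.SVC(kernel='linear')
    clf.fit(labelled_vectors, labels)
    predictions = clf.predict(remaining_vectors)
    match_child = [v for v, p in zip(remaining_vectors, predictions) if p == 1]
    non_match_child = [v for v, p in zip(remaining_vectors, predictions) if p == 0]
    return match_child, non_match_child
```

Both child clusters are then re-queued, which is why the queue length grows to 2 in the next loop.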

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (427, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 145 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
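
The farthest-first selection used above can be sketched as a greedy traversal: seed with one vector, then repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest, so the sample spreads across the cluster. A minimal version (seeding with the first vector is an assumption; the log does not show the script's exact starting rule):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: each step adds the vector that is
    # farthest (by minimum Euclidean distance) from everything selected so far.
    selected = [vectors[0]]
    while len(selected) < k:
        candidate = max((v for v in vectors if v not in selected),
                        key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(candidate)
    return selected
```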

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)179_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 179), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)179_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
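
The frequency distribution above (how many unique weight vectors occur once, twice, and so on) can be computed with two nested counts; a sketch using `collections.Counter` (hypothetical helper name, not the script's own code):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First count how often each unique weight vector occurs,
    # then count how many unique vectors share each occurrence count.
    vector_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vector_counts.values())

# Three copies of one vector and a single other vector:
occurrence_distribution([[0.5, 1.0], [0.5, 1.0], [0.5, 1.0], [0.2, 0.0]])
# -> Counter({3: 1, 1: 1})
```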

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579
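
Pureness per unique weight vector is the fraction of its occurrences that were generated by true matches; a vector with pureness strictly between 0 and 1 is non-pure, and its minority-class copies are removed. A sketch of the pureness computation (hypothetical helper, not the script's own code):

```python
from collections import defaultdict

def pureness_per_unique_vector(weight_vectors, match_flags):
    # vector -> [number of true-match occurrences, total occurrences]
    counts = defaultdict(lambda: [0, 0])
    for vec, is_match in zip(weight_vectors, match_flags):
        key = tuple(vec)
        counts[key][0] += int(is_match)
        counts[key][1] += 1
    return {key: matches / total for key, (matches, total) in counts.items()}

# 19 true-match copies and 1 non-match copy of the same vector -> pureness 0.95
pureness = pureness_per_unique_vector([[1.0, 1.0]] * 20,
                                      [True] * 19 + [False])
```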

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 141 matches and 543 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 141 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)786_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 786), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)786_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 460
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 460 weight vectors
  Containing 210 true matches and 250 true non-matches
    (45.65% true matches)
  Identified 426 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   409  (96.01%)
          2 :    14  (3.29%)
          3 :     2  (0.47%)
         17 :     1  (0.23%)

Identified 1 non-pure unique weight vector (from 426 unique weight vectors)
Pureness (as fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 178
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 247

Removed 1 non-pure weight vector

Final number of weight vectors to use: 459
  Number of unique weight vectors: 426

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (426, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 426 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 426 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 37 matches and 41 non-matches
    Purity of oracle classification:  0.526
    Entropy of oracle classification: 0.998
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  41
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 348 weight vectors
  Based on 37 matches and 41 non-matches
  Classified 246 matches and 102 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (246, 0.5256410256410257, 0.9981021327390103, 0.47435897435897434)
    (102, 0.5256410256410257, 0.9981021327390103, 0.47435897435897434)

Current size of match and non-match training data sets: 37 / 41

Selected cluster (queue ordering: random) with:
- Purity 0.53 and entropy 1.00
- Size 246 weight vectors
- Estimated match proportion 0.474

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 246 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.902, 1.000, 0.182, 0.071, 0.182, 0.222, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 43 matches and 26 non-matches
    Purity of oracle classification:  0.623
    Entropy of oracle classification: 0.956
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)624_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 624), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)624_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 997
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 997 weight vectors
  Containing 222 true matches and 775 true non-matches
    (22.27% true matches)
  Identified 943 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   906  (96.08%)
          2 :    34  (3.61%)
          3 :     2  (0.21%)
         17 :     1  (0.11%)
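
The occurrence histogram above can be produced by counting each distinct weight vector (as a tuple, so it is hashable) and then counting how often each multiplicity occurs. A sketch with toy data (function name is mine):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each distinct weight vector to its count, then count how many
    distinct vectors occur once, twice, and so on."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    freq_of_freq = Counter(vec_counts.values())
    return vec_counts, freq_of_freq

# Toy example: four vectors, one of them duplicated
vectors = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.9], [0.0, 0.0]]
counts, dist = occurrence_distribution(vectors)
print(len(counts))  # 3 unique weight vectors
print(dist)         # Counter({1: 2, 2: 1})
```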

Identified 1 non-pure unique weight vector (from 943 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 754

Removed 1 non-pure weight vector
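
The non-pure filter groups identical weight vectors and, where the same vector appears with both true-match statuses, drops the minority-class copies. The log does not show the script's tie-breaking rule, so this sketch (with names of my own) assumes ties keep the non-match side:

```python
from collections import defaultdict

def remove_minority_class(weight_vectors, labels):
    """For each distinct weight vector, keep only the copies whose label
    is the majority label for that vector."""
    groups = defaultdict(lambda: [0, 0])  # vector -> [non-match count, match count]
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)][int(lab)] += 1
    kept = []
    for vec, lab in zip(weight_vectors, labels):
        non_m, m = groups[tuple(vec)]
        majority = m > non_m  # majority label; ties fall to non-match
        if bool(lab) == majority:
            kept.append((vec, lab))
    return kept

# One vector occurs 17 times as 16 matches + 1 non-match -> pureness 16/17 = 0.941
data = [([0.9, 0.9], True)] * 16 + [([0.9, 0.9], False)] + [([0.1, 0.1], False)]
vecs, labs = zip(*data)
print(len(remove_minority_class(vecs, labs)))  # 17: the lone non-match copy removed
```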

Final number of weight vectors to use: 996
  Number of unique weight vectors: 943

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (943, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 943 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 943 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
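
The "farthest first" selection above is the classic greedy farthest-first traversal: repeatedly add the vector whose minimum distance to the already-selected set is largest. The log does not show the seeding rule or metric, so this sketch seeds with the first vector and assumes Euclidean distance:

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: seed with one vector, then add,
    k-1 times, the vector maximising its distance to the nearest
    already-selected vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# 1-D toy example: from {0, 1, 9, 10}, starting at 0, the farthest
# point (10) is picked first, spreading the sample over the space.
pts = [[0.0], [1.0], [9.0], [10.0]]
print(farthest_first(pts, 3))
```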

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 32 matches and 55 non-matches
    Purity of oracle classification:  0.632
    Entropy of oracle classification: 0.949
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
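
Every run in this log uses a perfect oracle, but the `oracle_acc` parameter in the usage header allows a noisy one. A sketch of how such an oracle can be simulated (function name and seeding are my own choices):

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    """Simulate a human oracle: each true label is returned unchanged
    with probability `accuracy` and flipped otherwise.  At accuracy 1.0
    (as in the 100.00% runs above) no label is ever flipped, because
    random() is always < 1.0."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    return [lab if rng.random() < accuracy else not lab for lab in true_labels]

# The 87-vector sample above: 32 true matches, 55 true non-matches
truth = [True] * 32 + [False] * 55
answers = noisy_oracle(truth, 1.0)
print(sum(answers), len(answers) - sum(answers))  # 32 55
```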

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 856 weight vectors
  Based on 32 matches and 55 non-matches
  Classified 302 matches and 554 non-matches
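
The script trains an SVM on the oracle-labelled sample and uses its predictions to split the remaining 856 vectors into two child clusters. As a dependency-free stand-in for the SVM (not the script's actual classifier), a nearest-centroid split shows the same train-on-sample, partition-the-rest pattern:

```python
def centroid_split(labeled, unlabeled):
    """Split the unlabeled vectors into match/non-match child clusters
    by assigning each to the nearer class centroid of the labelled sample."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    match_c = centroid([v for v, lab in labeled if lab])
    non_c = centroid([v for v, lab in labeled if not lab])
    matches, non_matches = [], []
    for v in unlabeled:
        (matches if sq_dist(v, match_c) < sq_dist(v, non_c)
         else non_matches).append(v)
    return matches, non_matches

# Toy labelled sample and three unlabelled vectors
labeled = [([0.9, 0.9], True), ([0.8, 1.0], True),
           ([0.1, 0.2], False), ([0.2, 0.0], False)]
m, n = centroid_split(labeled, [[0.95, 0.85], [0.05, 0.1], [0.3, 0.2]])
print(len(m), len(n))  # 1 2
```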

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (302, 0.632183908045977, 0.9489804585630242, 0.367816091954023)
    (554, 0.632183908045977, 0.9489804585630242, 0.367816091954023)
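
Both queue entries above carry identical purity, entropy, and match-proportion values because each child cluster inherits the statistics observed on the parent's oracle sample (32 matches, 55 non-matches). A sketch of that bookkeeping (names are mine):

```python
import math

def child_cluster_stats(size, sample_matches, sample_non_matches):
    """Build a (size, purity, entropy, est. match proportion) queue entry
    for a child cluster, taking purity, entropy, and proportion from the
    parent's oracle-classified sample."""
    total = sample_matches + sample_non_matches
    p = sample_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return (size, purity, entropy, p)

# The two child clusters produced by Loop 1's sample
print(child_cluster_stats(302, 32, 55))
print(child_cluster_stats(554, 32, 55))
```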

Current size of match and non-match training data sets: 32 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 302 weight vectors
- Estimated match proportion 0.368

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 302 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 44 matches and 25 non-matches
    Purity of oracle classification:  0.638
    Entropy of oracle classification: 0.945
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  25
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(15)655_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 655), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)655_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1043
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1043 weight vectors
  Containing 222 true matches and 821 true non-matches
    (21.28% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   952  (96.26%)
          2 :    34  (3.44%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 800

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1042
  Number of unique weight vectors: 989

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 145 matches and 757 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (757, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 145 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(10)143_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 143), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)143_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 881
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 881 weight vectors
  Containing 212 true matches and 669 true non-matches
    (24.06% true matches)
  Identified 829 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   793  (95.66%)
          2 :    33  (3.98%)
          3 :     2  (0.24%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 829 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 648

Removed 1 non-pure weight vector

Final number of weight vectors to use: 880
  Number of unique weight vectors: 829

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (829, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 829 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 829 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and misclassify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
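The purity and entropy values reported after each oracle call are consistent with the standard definitions: purity is the majority-class fraction of the labelled sample, and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch of that computation (the function name `cluster_stats` is illustrative, not taken from the program):

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity and entropy of a binary match/non-match split.

    Purity is the fraction of the majority class; entropy is the
    binary Shannon entropy of the match proportion, in bits.
    """
    total = num_match + num_non_match
    p = num_match / total
    purity = max(num_match, num_non_match) / total
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the 29 matches and 57 non-matches above this gives purity 0.663 and entropy 0.922, matching the figures in the log.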

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 743 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 162 matches and 581 non-matches
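The split step trains a classifier on all oracle-labelled vectors collected so far and divides the remaining cluster by predicted class, pushing both sub-clusters back onto the queue. A sketch assuming a scikit-learn style `SVC` (the kernel and parameters actually used by the program are not shown in the log):

```python
from sklearn.svm import SVC

def svm_split(train_match, train_non_match, cluster):
    """Train an SVM on the oracle-labelled weight vectors, then
    split the unlabelled cluster into predicted-match and
    predicted-non-match sub-clusters."""
    X = train_match + train_non_match
    y = [1] * len(train_match) + [0] * len(train_non_match)
    clf = SVC()  # default RBF kernel; an assumption, not from the log
    clf.fit(X, y)
    pred = clf.predict(cluster)
    matches = [v for v, p in zip(cluster, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster, pred) if p == 0]
    return matches, non_matches
```

Note that both resulting sub-clusters initially inherit the purity, entropy, and match-proportion estimates of the labelled sample, which is why the two queue entries in Loop 2 show identical statistics.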

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (162, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (581, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 581 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 581 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
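Farthest-first selection is presumably the standard greedy max-min traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest, so the oracle is shown a maximally diverse sample. A sketch under that assumption (the seeding rule, here simply the first vector, is a guess; the program may seed differently):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors.

    Starts from the first vector, then repeatedly adds the vector
    whose minimum Euclidean distance to the already-selected ones
    is largest.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```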

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and misclassify 0
  Classified 0 matches and 75 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(10)407_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 407), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)407_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 996
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 996 weight vectors
  Containing 221 true matches and 775 true non-matches
    (22.19% true matches)
  Identified 942 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   905  (96.07%)
          2 :    34  (3.61%)
          3 :     2  (0.21%)
         17 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 942 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 754

Removed 1 non-pure weight vector

Final number of weight vectors to use: 995
  Number of unique weight vectors: 942
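The non-pure filtering removes the minority-class copies of any unique weight vector that occurs with both match statuses (e.g. the pureness-0.941 vector above, 16 of whose 17 copies share one status, so 1 copy is removed). A sketch of that step (the input layout and function name are illustrative, not from the program):

```python
from collections import Counter, defaultdict

def remove_non_pure(weight_vectors):
    """Remove minority-class copies of non-pure unique vectors.

    weight_vectors: list of (tuple_of_weights, is_match) pairs.
    For each unique weight tuple, if its copies carry both match
    statuses, drop the minority-class copies so that every unique
    vector left is pure.
    """
    counts = defaultdict(Counter)
    for vec, is_match in weight_vectors:
        counts[vec][is_match] += 1
    kept = []
    for vec, is_match in weight_vectors:
        c = counts[vec]
        if c[True] and c[False]:  # non-pure unique vector
            majority = c[True] >= c[False]  # ties kept as matches here
            if is_match != majority:
                continue  # drop minority-class copy
        kept.append((vec, is_match))
    return kept
```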

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (942, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 942 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 942 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and misclassify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 855 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 301 matches and 554 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (301, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (554, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 301 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 301 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.600, 1.000, 0.217, 0.132, 0.167, 0.125, 0.188] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and misclassify 0
  Classified 42 matches and 26 non-matches
    Purity of oracle classification:  0.618
    Entropy of oracle classification: 0.960
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)173_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 173), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)173_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 443
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 443 weight vectors
  Containing 205 true matches and 238 true non-matches
    (46.28% true matches)
  Identified 417 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   403  (96.64%)
          2 :    11  (2.64%)
          3 :     2  (0.48%)
         12 :     1  (0.24%)

Identified 1 non-pure unique weight vector (from 417 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 237

Removed 1 non-pure weight vector

Final number of weight vectors to use: 442
  Number of unique weight vectors: 417

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (417, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 417 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 417 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and misclassify 0
  Classified 35 matches and 43 non-matches
    Purity of oracle classification:  0.551
    Entropy of oracle classification: 0.992
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 339 weight vectors
  Based on 35 matches and 43 non-matches
  Classified 139 matches and 200 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (139, 0.5512820512820513, 0.9923985003332222, 0.44871794871794873)
    (200, 0.5512820512820513, 0.9923985003332222, 0.44871794871794873)

Current size of match and non-match training data sets: 35 / 43

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 200 weight vectors
- Estimated match proportion 0.449

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 200 vectors
  The selected farthest weight vectors are:
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
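
The "farthest first selection" step logged above can be sketched as a greedy farthest-first traversal over the weight vectors: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. This is a minimal sketch assuming Euclidean distance and a fixed starting index; the original program may use a different metric or seed.

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly select the vector whose
    minimum Euclidean distance to the already-selected set is largest."""
    selected = [start]
    # min_dist[i] = distance from vector i to its nearest selected vector
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            d = math.dist(v, vectors[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected
```

This greedy scheme spreads the sample across the weight-vector space, which is why the selections above mix clear matches and clear non-matches rather than clustering in one region.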

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 9 matches and 55 non-matches
    Purity of oracle classification:  0.859
    Entropy of oracle classification: 0.586
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
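
The purity, entropy, and estimated match proportion figures reported after each oracle call follow directly from the match/non-match counts: purity is the majority-class fraction, entropy is the binary Shannon entropy of the match proportion. A minimal sketch (the function name is illustrative, not from the original code):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity, binary entropy, and match proportion of a labelled sample."""
    total = num_matches + num_non_matches
    p = num_matches / total                    # estimated match proportion
    purity = max(num_matches, num_non_matches) / total
    if p in (0.0, 1.0):
        entropy = 0.0                          # a pure sample has zero entropy
    else:
        entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return purity, entropy, p
```

For the 9 matches and 55 non-matches classified above this gives purity 55/64 ≈ 0.859 and entropy ≈ 0.586, matching the logged values.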

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(15)929_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 929), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)929_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 814
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 814 weight vectors
  Containing 227 true matches and 587 true non-matches
    (27.89% true matches)
  Identified 757 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   720  (95.11%)
          2 :    34  (4.49%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 757 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 566

Removed 1 non-pure weight vector

Final number of weight vectors to use: 813
  Number of unique weight vectors: 757

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (757, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 757 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 757 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 672 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 160 matches and 512 non-matches
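
The SVM step trains on the weight vectors the oracle has labelled so far and splits the rest of the cluster by predicted class (here, the 672 unlabelled vectors into 160 predicted matches and 512 predicted non-matches). A minimal sketch using scikit-learn's `SVC`; the kernel and parameters are assumptions, since the original code's SVM configuration is not shown in this log:

```python
from sklearn.svm import SVC  # assumption: the SVM step via scikit-learn

def split_cluster(match_vectors, non_match_vectors, cluster_vectors):
    """Train an SVM on oracle-labelled vectors, then split the remaining
    cluster into predicted matches and predicted non-matches."""
    X = match_vectors + non_match_vectors
    y = [1] * len(match_vectors) + [0] * len(non_match_vectors)
    clf = SVC(kernel='linear')
    clf.fit(X, y)
    pred = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, pred) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.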

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (160, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (512, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 160 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 160 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)419_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 419), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)419_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 589
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 589 weight vectors
  Containing 206 true matches and 383 true non-matches
    (34.97% true matches)
  Identified 555 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   537  (96.76%)
          2 :    15  (2.70%)
          3 :     2  (0.36%)
         16 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 555 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 380

Removed 1 non-pure weight vector

Final number of weight vectors to use: 588
  Number of unique weight vectors: 555

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (555, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 555 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 555 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 30 matches and 52 non-matches
    Purity of oracle classification:  0.634
    Entropy of oracle classification: 0.947
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 473 weight vectors
  Based on 30 matches and 52 non-matches
  Classified 148 matches and 325 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6341463414634146, 0.9474351361840306, 0.36585365853658536)
    (325, 0.6341463414634146, 0.9474351361840306, 0.36585365853658536)

Current size of match and non-match training data sets: 30 / 52

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 148 weight vectors
- Estimated match proportion 0.366

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 50 matches and 6 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.491
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(15)681_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 681), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)681_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
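
The "far" initial selection above is a farthest-first traversal: start from one vector, then repeatedly add the vector whose minimum distance to the vectors already selected is largest. A minimal sketch — the Euclidean metric and the random starting point are assumptions about details the log does not show:

```python
import random

def farthest_first(vectors, k, seed=None):
    """Greedy farthest-first traversal over a list of weight vectors."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    rnd = random.Random(seed)
    first = rnd.randrange(len(vectors))
    selected = [first]
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, vectors[first]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[nxt]))
    return [vectors[i] for i in selected]
```

This greedy rule spreads the sample across the cluster, which is why the selected vectors above mix clear matches, clear non-matches and borderline cases.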

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
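
The purity and entropy figures reported for the oracle's classification (0.701 and 0.880 here) are the majority-class fraction and the binary Shannon entropy of the match/non-match split:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class.
    Entropy: binary Shannon entropy (in bits) of the two proportions."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

With the 26 matches and 61 non-matches above, this reproduces 0.701 and 0.880.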

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches
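
The split step trains a classifier on the oracle-labelled vectors and uses it to divide the remaining cluster into predicted matches and non-matches. The log uses an SVM; as a dependency-free stand-in that shows only the splitting logic (a nearest-centroid rule, not the actual SVM), the step looks like:

```python
def centroid_split(match_train, nonmatch_train, unlabelled):
    """Split `unlabelled` weight vectors by whichever training centroid
    is nearer. A nearest-centroid stand-in for the SVM in the log; the
    real program would fit and apply an SVM here instead."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    cm, cn = centroid(match_train), centroid(nonmatch_train)
    matches = [v for v in unlabelled if dist2(v, cm) < dist2(v, cn)]
    non_matches = [v for v in unlabelled if dist2(v, cm) >= dist2(v, cn)]
    return matches, non_matches
```

The two resulting sub-clusters (118 and 793 vectors here) are then pushed back onto the queue for further refinement.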

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 118 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 118 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
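
Throughout this run the oracle accuracy is 100%, so the simulated oracle never mislabels. For lower `oracle_acc` values, one plausible simulation — an assumption, since the log never exercises the noise model — flips each true label independently with probability 1 minus the accuracy:

```python
import random

def simulated_oracle(true_labels, accuracy, seed=None):
    """Return oracle answers: each true match status is kept with
    probability `accuracy` and flipped otherwise (assumed noise model)."""
    rnd = random.Random(seed)
    return [lab if rnd.random() < accuracy else not lab
            for lab in true_labels]
```

At `accuracy=1.0` this degenerates to the perfect oracle seen in this trace.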

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing the file: diverg(15)274_NEW.csv
<class 'pandas.core.series.Series'>
Current line right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 274), dtype: object
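
The precision, recall and f-measure fields in these per-file summaries follow the standard confusion-matrix definitions; for the row above, tp = 58, fp = 0 and fn = 241:

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall and F-measure from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

This reproduces the reported precision 1, recall 0.19398 and f-measure 0.32493.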

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)274_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 637
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 637 weight vectors
  Containing 195 true matches and 442 true non-matches
    (30.61% true matches)
  Identified 610 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   594  (97.38%)
          2 :    13  (2.13%)
          3 :     2  (0.33%)
         11 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 610 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 170
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 439

Removed 1 non-pure weight vector

Final number of weight vectors to use: 636
  Number of unique weight vectors: 610

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (610, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 610 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 610 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 29 matches and 54 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.934
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 527 weight vectors
  Based on 29 matches and 54 non-matches
  Classified 144 matches and 383 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)
    (383, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)

Current size of match and non-match training data sets: 29 / 54

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 383 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 383 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.870, 0.619, 0.643, 0.700, 0.524] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 4 matches and 67 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.313
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing the file: diverg(20)332_NEW.csv
<class 'pandas.core.series.Series'>
Current line right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 332), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)332_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.429, 0.786, 0.750, 0.389, 0.857] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 147 matches and 537 non-matches

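The split step above (85 oracle-labelled vectors used to classify the remaining 684) can be sketched with scikit-learn's `SVC` standing in for the program's SVM classifier. The array names and random feature data here are illustrative only, not taken from the program:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Hypothetical stand-ins for the oracle-labelled sample and the rest
# of the cluster (7 similarity weights per record pair, as above).
train_X = rng.random((85, 7))
train_y = np.array([1] * 30 + [0] * 55)   # 30 matches, 55 non-matches
remaining = rng.random((684, 7))          # unlabelled cluster members

clf = SVC(kernel='linear')
clf.fit(train_X, train_y)
pred = clf.predict(remaining)

# The cluster is split into two sub-clusters, both re-queued for
# further processing.
match_cluster     = remaining[pred == 1]
non_match_cluster = remaining[pred == 0]
print(len(match_cluster) + len(non_match_cluster))  # 684
```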
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (537, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 147 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 147 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

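Farthest-first selection, as used above, repeatedly picks the vector whose distance to the already-selected set is largest, spreading the sample across the cluster. A minimal sketch under the assumption of Euclidean distance and a fixed first seed (the program's own seed choice and distance function may differ):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal."""
    selected = [vectors[0]]            # seed with the first vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # distance from a candidate to its nearest selected vector
        def min_dist(v):
            return min(math.dist(v, s) for s in selected)
        best = max(remaining, key=min_dist)
        remaining.remove(best)
        selected.append(best)
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0)], 2)
print(sample)  # [(0.0, 0.0), (1.0, 1.0)]
```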
Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 53 matches and 2 non-matches
    Purity of oracle classification:  0.964
    Entropy of oracle classification: 0.225
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)675_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (20, 1 - acm diverg, 675), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)675_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 920
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 920 weight vectors
  Containing 215 true matches and 705 true non-matches
    (23.37% true matches)
  Identified 868 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   832  (95.85%)
          2 :    33  (3.80%)
          3 :     2  (0.23%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 868 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 684

Removed 1 non-pure weight vector

Final number of weight vectors to use: 919
  Number of unique weight vectors: 868

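The analysis step above (occurrence frequencies of unique weight vectors, and their pureness as the proportion of true matches among occurrences) can be sketched with `collections.Counter`. The toy data and variable names below are illustrative, not from the program:

```python
from collections import Counter, defaultdict

# (weight_vector, true_match_status) pairs; toy data for illustration
vectors = [
    ((1.0, 0.5), True), ((1.0, 0.5), True),
    ((0.2, 0.1), False),
    ((0.8, 0.9), True), ((0.8, 0.9), False),  # a non-pure vector
]

occurrences = Counter(v for v, _ in vectors)
freq_dist = Counter(occurrences.values())  # occurrence count -> num vectors

matches = defaultdict(int)
for v, is_match in vectors:
    matches[v] += int(is_match)
pureness = {v: matches[v] / occurrences[v] for v in occurrences}

# Non-pure vectors (pureness strictly between 0 and 1) get their
# minority-class occurrences removed.
non_pure = [v for v, p in pureness.items() if 0.0 < p < 1.0]
print(freq_dist, non_pure)
```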
Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (868, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 868 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 868 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 782 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 158 matches and 624 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (158, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (624, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 158 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 158 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(10)4_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 4), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)4_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 991
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 991 weight vectors
  Containing 194 true matches and 797 true non-matches
    (19.58% true matches)
  Identified 949 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   914  (96.31%)
          2 :    32  (3.37%)
          3 :     2  (0.21%)
          7 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 949 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.000 : 777

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 991
  Number of unique weight vectors: 949

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (949, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 949 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 949 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 862 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 287 matches and 575 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (287, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (575, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 575 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 575 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 0 matches and 75 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

*** Warning: Oracle returned an empty match dictionary ***
Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analysing file: diverg(15)539_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 539), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)539_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 201 true matches and 752 true non-matches
    (21.09% true matches)
  Identified 908 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   874  (96.26%)
          2 :    31  (3.41%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 908 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 952
  Number of unique weight vectors: 908
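
The non-pure vector removal described above can be sketched as follows: group identical weight vectors, compute each group's match fraction, and drop the minority-class occurrences of any mixed group. This is a minimal illustration, not the script's actual code; `drop_minority` is a hypothetical name.

```python
from collections import defaultdict

def drop_minority(labelled_vectors):
    """Group identical weight vectors and, for any non-pure vector
    (0 < match fraction < 1), keep only its majority-class occurrences."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, flags in groups.items():
        pureness = sum(flags) / len(flags)  # fraction of true matches
        if 0.0 < pureness < 1.0:
            majority = pureness >= 0.5
            kept += [(list(vec), f) for f in flags if f == majority]
        else:
            kept += [(list(vec), f) for f in flags]
    return kept
```

For example, a weight vector occurring 11 times with 10 matches (pureness 0.909, as in the run above) would lose its single non-match occurrence.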

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (908, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 908 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 908 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
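
The farthest-first selection above greedily picks each next vector to maximise its minimum distance to the vectors already selected. A minimal sketch, assuming Euclidean distance and an arbitrary starting index (`farthest_first` is an illustrative name, not the script's function):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    minimum distance to the already-selected set is largest."""
    selected = [start]
    # minimum distance from each vector to the selected set so far
    dist = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=dist.__getitem__)
        selected.append(nxt)
        for i, v in enumerate(vectors):
            dist[i] = min(dist[i], math.dist(v, vectors[nxt]))
    return selected
```

The design choice matters here: farthest-first spreads the sample across the corners of the weight-vector space, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.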

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 25 matches and 62 non-matches
    Purity of oracle classification:  0.713
    Entropy of oracle classification: 0.865
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0
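
The purity and entropy values reported for each oracle classification follow the usual two-class definitions: purity is the majority-class fraction, entropy is the binary Shannon entropy of the match proportion. A sketch (function names are illustrative):

```python
import math

def purity(num_match, num_non_match):
    """Fraction of the sample belonging to the majority class."""
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    """Binary Shannon entropy of the match / non-match split, in bits."""
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h
```

For the 25 matches and 62 non-matches above, this gives purity 0.713 and entropy 0.865, the values carried into the queue tuples of the next loop.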

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 821 weight vectors
  Based on 25 matches and 62 non-matches
  Classified 110 matches and 711 non-matches
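
The split step trains a classifier on the oracle-labelled sample and applies it to the unlabelled remainder of the cluster, yielding a predicted-match and a predicted-non-match sub-cluster. A minimal sketch assuming scikit-learn's `SVC`; the script's own SVM implementation, kernel, and parameters may differ, and `split_cluster` is a hypothetical name:

```python
from sklearn.svm import SVC

def split_cluster(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on the oracle-labelled sample (labels 1 = match,
    0 = non-match) and split the remaining weight vectors by prediction."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if p == 0]
    return matches, non_matches
```

Both sub-clusters are then pushed back onto the queue, inheriting the parent's purity and entropy estimates until they are sampled themselves.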

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (110, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)
    (711, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 25 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.71 and entropy 0.87
- Size 711 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 711 vectors
  The selected farthest weight vectors are:
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 13 matches and 58 non-matches
    Purity of oracle classification:  0.817
    Entropy of oracle classification: 0.687
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(20)577_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (20, 1 - acm diverg, 577), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)577_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1036
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1036 weight vectors
  Containing 188 true matches and 848 true non-matches
    (18.15% true matches)
  Identified 994 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   963  (96.88%)
          2 :    28  (2.82%)
          3 :     2  (0.20%)
         11 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 994 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 166
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1035
  Number of unique weight vectors: 994

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (994, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 994 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 994 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 907 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 77 matches and 830 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (77, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (830, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 77 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 38

Farthest first selection of 38 weight vectors from 77 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)

Perform oracle with 100.00% accuracy on 38 weight vectors
  The oracle will correctly classify 38 weight vectors and wrongly classify 0
  Classified 38 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 38 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(15)665_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 665), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)665_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 29 matches and 59 non-matches
    Purity of oracle classification:  0.670
    Entropy of oracle classification: 0.914
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
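
The purity and entropy figures reported for each oracle step can be reproduced directly (a sketch, assuming from the printed values that purity is the majority-class fraction and entropy the binary Shannon entropy of the match proportion):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a labelled set."""
    total = num_matches + num_non_matches
    p = num_matches / total                  # match proportion
    purity = max(p, 1.0 - p)
    if p in (0.0, 1.0):
        entropy = 0.0                        # a pure set has zero entropy
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy

# The oracle step above: 29 matches, 59 non-matches
purity, entropy = purity_entropy(29, 59)
print(round(purity, 3), round(entropy, 3))   # 0.67 0.914
```

The same two quantities also appear in the cluster-queue tuples printed at the start of each loop.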

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 29 matches and 59 non-matches
  Classified 162 matches and 777 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (162, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)
    (777, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)

Current size of match and non-match training data sets: 29 / 59

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 162 weight vectors
- Estimated match proportion 0.330

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 162 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
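
The farthest-first traversal behind these selections can be sketched as follows (a minimal version, assuming Euclidean distance and a greedy pick of the vector that maximises its minimum distance to the vectors already selected):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: select k mutually diverse vectors.

    Starts from the first vector, then repeatedly adds the remaining
    vector whose minimum distance to the selected set is largest.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

farthest_first([(0, 0), (1, 0), (0.1, 0.1), (0, 1)], 3)
# → [(0, 0), (1, 0), (0, 1)]
```

This greedy strategy spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches and clear non-matches.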

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
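
The overall control flow visible in this log — pop a cluster from the queue, oracle-label a sample, and split and re-queue impure remainders until the budget is exhausted — can be sketched as follows (a hypothetical simplification: a FIFO queue, head-of-list sampling, and a midpoint split stand in for the random queue ordering, farthest-first selection, and SVM split of the actual run):

```python
def selection_loop(clusters, oracle, budget, sample_size, min_purity=0.95):
    """Sketch of the recursive training-example selection loop.

    clusters: list of clusters (each a list of weight vectors / indices)
    oracle:   callable returning 1 (match) or 0 (non-match) per item
    """
    queue = list(clusters)
    labelled = []
    used = 0
    while queue and used + sample_size <= budget:
        cluster = queue.pop(0)                   # FIFO stand-in for random pick
        sample, rest = cluster[:sample_size], cluster[sample_size:]
        labels = [oracle(x) for x in sample]     # manual classification
        labelled.extend(zip(sample, labels))
        used += len(sample)
        if rest:
            p = sum(labels) / len(labels)        # estimated match proportion
            if min(p, 1 - p) > 1 - min_purity:   # not pure enough: split
                mid = len(rest) // 2             # midpoint stand-in for SVM split
                queue.append(rest[:mid])
                queue.append(rest[mid:])
            # pure-enough clusters are left as-is in this sketch
    return labelled
```

In the real run the remainder of a pure-enough cluster would be classified by the trained classifier rather than discarded.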

39.0
Analysing the file: diverg(15)660_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 660), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)660_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 346
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 346 weight vectors
  Containing 212 true matches and 134 true non-matches
    (61.27% true matches)
  Identified 312 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   294  (94.23%)
          2 :    15  (4.81%)
          3 :     2  (0.64%)
         16 :     1  (0.32%)

Identified 1 non-pure unique weight vector (from 312 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 131

Removed 1 non-pure weight vector

Final number of weight vectors to use: 345
  Number of unique weight vectors: 312
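
The pureness filtering reported above can be reproduced with a short sketch (one plausible policy, matching this run: group duplicate weight vectors, compute the fraction of true-match copies, and drop the minority-class copies of any non-pure vector; a later run in this log instead drops all copies):

```python
from collections import defaultdict

def pureness_filter(weight_vectors, labels):
    """Drop minority-class copies of non-pure duplicate weight vectors.

    Pureness of a unique vector is the fraction of its copies that are
    true matches; 0.0 and 1.0 are pure, anything in between is not.
    """
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)
    kept_vecs, kept_labs = [], []
    for vec, labs in groups.items():
        p = sum(labs) / len(labs)            # pureness of this unique vector
        majority = 1 if p >= 0.5 else 0
        for lab in labs:
            if 0.0 < p < 1.0 and lab != majority:
                continue                     # minority-class copy: removed
            kept_vecs.append(list(vec))
            kept_labs.append(lab)
    return kept_vecs, kept_labs
```

For the run above this removes the single non-match copy of the vector with pureness 0.938 (15 match copies out of 16).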

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (312, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 312 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 73

Perform initial selection using "far" method

Farthest first selection of 73 weight vectors from 312 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 34 matches and 39 non-matches
    Purity of oracle classification:  0.534
    Entropy of oracle classification: 0.997
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  39
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 239 weight vectors
  Based on 34 matches and 39 non-matches
  Classified 150 matches and 89 non-matches
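
The split step above trains a classifier on the oracle-labelled vectors and partitions the remaining cluster by predicted class. The run uses an SVM; as a dependency-free stand-in (not the original classifier), a minimal perceptron shows the same train-then-split mechanics:

```python
def train_linear(X, y, epochs=100, lr=0.1):
    """Train a perceptron (a stand-in for the SVM used in the run)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t = 1 if yi else -1                  # signed target
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if t * score <= 0:                   # misclassified: update
                w = [wj + lr * t * xj for wj, xj in zip(w, xi)]
                b += lr * t
    return w, b

def split_cluster(w, b, cluster):
    """Partition a cluster into predicted matches / non-matches."""
    matches, non_matches = [], []
    for v in cluster:
        score = sum(wj * xj for wj, xj in zip(w, v)) + b
        (matches if score > 0 else non_matches).append(v)
    return matches, non_matches
```

Both resulting sub-clusters are then pushed back onto the queue, as the next loop header shows.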

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 73
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.5342465753424658, 0.9966132830150964, 0.4657534246575342)
    (89, 0.5342465753424658, 0.9966132830150964, 0.4657534246575342)

Current size of match and non-match training data sets: 34 / 39

Selected cluster with (queue ordering: random):
- Purity 0.53 and entropy 1.00
- Size 89 weight vectors
- Estimated match proportion 0.466

Sample size for this cluster: 46

Farthest first selection of 46 weight vectors from 89 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.800, 0.636, 0.563, 0.545, 0.722] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 46 weight vectors
  The oracle will correctly classify 46 weight vectors and wrongly classify 0
  Classified 2 matches and 44 non-matches
    Purity of oracle classification:  0.957
    Entropy of oracle classification: 0.258
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0

Deleted 46 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(10)864_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987952
recall                 0.274247
f-measure              0.429319
da                           83
dm                            0
ndm                           0
tp                           82
fp                            1
tn                  4.76529e+07
fn                          217
Name: (10, 1 - acm diverg, 864), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)864_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 502
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 502 weight vectors
  Containing 154 true matches and 348 true non-matches
    (30.68% true matches)
  Identified 485 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   477  (98.35%)
          2 :     5  (1.03%)
          3 :     2  (0.41%)
          9 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 485 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 137
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 347

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 493
  Number of unique weight vectors: 484

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (484, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 484 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 484 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.714, 0.353, 0.583, 0.571] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 27 matches and 53 non-matches
    Purity of oracle classification:  0.662
    Entropy of oracle classification: 0.922
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 404 weight vectors
  Based on 27 matches and 53 non-matches
  Classified 114 matches and 290 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (114, 0.6625, 0.9224062617590723, 0.3375)
    (290, 0.6625, 0.9224062617590723, 0.3375)

Current size of match and non-match training data sets: 27 / 53

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 290 weight vectors
- Estimated match proportion 0.338

Sample size for this cluster: 66

Farthest first selection of 66 weight vectors from 290 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.500, 0.452, 0.632, 0.714, 0.667] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 0.000, 0.750, 0.667, 0.444, 0.765, 0.714] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.667, 0.000, 0.800, 0.684, 0.667, 0.529, 0.609] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.481, 0.643, 0.667, 0.350, 0.643] (False)

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 1 matches and 65 non-matches
    Purity of oracle classification:  0.985
    Entropy of oracle classification: 0.113
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

83.0
Analysing file: diverg(20)89_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 89), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)89_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
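
The occurrence table above (how many unique weight vectors occur once, twice, and so on) can be computed with two passes of `collections.Counter`. A minimal sketch, assuming weight vectors arrive as lists of floats; the function name is illustrative, not from the original program:

```python
from collections import Counter

def occurrence_distribution(weight_vec_list):
    """Tally how often each exact weight vector occurs, then count how
    many unique vectors share each occurrence count (illustrative
    helper, not taken from the original program)."""
    vec_counts = Counter(tuple(v) for v in weight_vec_list)
    freq_dist = Counter(vec_counts.values())
    return dict(sorted(freq_dist.items()))

# Example: one vector duplicated, one singleton
print(occurrence_distribution([[0.5, 1.0], [0.5, 1.0], [0.2, 0.0]]))  # {1: 1, 2: 1}
```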

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
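
A non-pure unique weight vector is one generated by both true matching and true non-matching record pairs; the step logged above keeps only its majority-class copies. A minimal sketch of that filter, assuming the input is a list of `(weight_vector_tuple, is_match)` pairs; `remove_minority_class` is a hypothetical name:

```python
from collections import defaultdict

def remove_minority_class(pairs):
    """Compute each unique weight vector's pureness (fraction of true
    matches among its occurrences) and drop the minority-class copies
    of non-pure vectors (illustrative, not the original code)."""
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        if 0.0 < pureness < 1.0:           # non-pure: mixed labels
            majority = pureness >= 0.5
            kept += [(vec, majority)] * sum(1 for l in labels if l == majority)
        else:                              # pure: keep all copies
            kept += [(vec, labels[0])] * len(labels)
    return kept
```

For example, a vector seen 19 times as a match and once as a non-match has pureness 0.95; only the single non-match copy is removed, consistent with the "0.950 : 1" entry and the one removed vector above.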

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
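
The "far" method logged above is a farthest-first traversal: after a seed vector, each new pick maximises its minimum distance to the vectors already selected, which spreads the sample over the corners of the weight-vector space. A sketch assuming Euclidean distance and seeding from the first vector (the program's actual distance measure and seeding strategy may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising its minimum Euclidean
    distance to the already-selected set (illustrative sketch)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]                        # assumed seed
    min_d = [dist(vectors[0], v) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        # update each vector's distance to its nearest selected vector
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], dist(vectors[i], v))
    return selected
```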

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
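
The purity, entropy, and estimated match proportion reported for each oracle-classified sample follow directly from the match/non-match counts: purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion. A sketch reproducing the figures above (the helper name is illustrative):

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity (majority-class fraction), binary entropy, and estimated
    match proportion of a classified sample (illustrative helper)."""
    n = num_match + num_non_match
    p = num_match / n                       # estimated match proportion
    purity = max(num_match, num_non_match) / n
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                         # define 0 * log2(0) = 0
            entropy -= q * math.log2(q)
    return purity, entropy, p

purity, entropy, p = cluster_stats(23, 65)
print(round(purity, 3), round(entropy, 3), round(p, 3))  # 0.739 0.829 0.261
```

These are exactly the values carried into the queue for both child clusters in the next loop.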

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
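
The split step above trains a classifier on the oracle-labelled sample (23 matches, 65 non-matches) and divides the remaining cluster members by predicted class. A sketch using scikit-learn's `SVC`; the kernel choice and other parameters here are assumptions, not the original program's settings:

```python
from sklearn.svm import SVC

def split_cluster_svm(train_vecs, train_labels, remaining_vecs):
    """Fit an SVM on oracle-classified weight vectors and split the
    remaining vectors into predicted-match / predicted-non-match
    sub-clusters (illustrative sketch, parameters are assumed)."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(remaining_vecs)
    match_cluster = [v for v, m in zip(remaining_vecs, pred) if m]
    non_match_cluster = [v for v, m in zip(remaining_vecs, pred) if not m]
    return match_cluster, non_match_cluster
```

The two sub-clusters then re-enter the queue, which is why the next loop shows a queue of length 2 (109 and 847 vectors).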

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)622_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 622), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)622_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1041
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1041 weight vectors
  Containing 222 true matches and 819 true non-matches
    (21.33% true matches)
  Identified 987 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   950  (96.25%)
          2 :    34  (3.44%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 987 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 798

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1040
  Number of unique weight vectors: 987

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (987, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 987 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 987 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 900 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 144 matches and 756 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (756, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 144 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 144 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 50 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.235
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)300_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 300), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)300_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

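The farthest-first selection reported above works greedily: starting from one vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest. A minimal sketch of that traversal (function and variable names are illustrative, not taken from the script):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first selection of k vectors.

    Repeatedly adds the vector whose minimum Euclidean distance
    to the already-selected set is largest.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected

vectors = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.0, 1.0)]
print(farthest_first(vectors, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

Because every new pick maximises the minimum distance, the sample spreads across the weight-vector space instead of clustering near the start vector — which is why the lists above mix clear matches and clear non-matches.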
Perform oracle with 100.00 accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

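The purity and entropy values printed after each oracle round follow directly from the match/non-match counts, assuming majority-class purity and binary Shannon entropy (a sketch, which reproduces the 0.671 / 0.914 figures above):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Majority-class purity and binary Shannon entropy of a cluster."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(28, 57)
print(round(purity, 3), round(entropy, 3))  # 0.671 0.914
```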
Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 141 matches and 543 non-matches

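At this point the script trains an SVM on the oracle-labelled vectors and classifies the rest of the cluster to split it in two. The sketch below substitutes a dependency-free nearest-centroid rule for the SVM (the function name and interface are assumptions, not the script's API):

```python
def split_by_classifier(train_matches, train_non_matches, unlabelled):
    """Split unlabelled vectors into predicted matches / non-matches.

    Stand-in for the SVM step: each vector is assigned to the class
    whose training centroid is closer (squared Euclidean distance).
    """
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_match = centroid(train_matches)
    c_non = centroid(train_non_matches)
    matches, non_matches = [], []
    for v in unlabelled:
        if sq_dist(v, c_match) <= sq_dist(v, c_non):
            matches.append(v)
        else:
            non_matches.append(v)
    return matches, non_matches

m, n = split_by_classifier([[0.9, 1.0], [1.0, 0.8]],
                           [[0.1, 0.0], [0.0, 0.2]],
                           [[0.95, 0.9], [0.05, 0.1]])
print(m, n)  # [[0.95, 0.9]] [[0.05, 0.1]]
```

The two resulting sub-clusters are then pushed onto the queue, which is why the queue length grows to 2 in the next loop.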
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 141 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)263_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 263), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)263_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 880
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 880 weight vectors
  Containing 208 true matches and 672 true non-matches
    (23.64% true matches)
  Identified 828 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   792  (95.65%)
          2 :    33  (3.99%)
          3 :     2  (0.24%)
         16 :     1  (0.12%)

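The occurrence histogram above (how many unique weight vectors appear once, twice, and so on) amounts to two nested counts; a sketch with illustrative data:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of unique vectors with that count."""
    per_vector = Counter(tuple(v) for v in weight_vectors)  # vector -> occurrences
    return Counter(per_vector.values())                     # occurrences -> #vectors

dist = occurrence_distribution([(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.9)])
print(dict(dist))  # {2: 1, 1: 2}
```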
Identified 1 non-pure unique weight vector (from 828 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 176
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 651

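Here the "pureness" of a unique weight vector is the fraction of its occurrences that were generated by true matches; copies carrying the minority label are dropped, which is where the removed vector comes from. A sketch of that clean-up (the helper name and tie-breaking rule are assumptions):

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """Keep, for each unique vector, only the copies with its majority label.

    labelled_vectors: list of (weight_vector_tuple, is_match) pairs.
    Returns the cleaned list and the number of removed copies.
    """
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[vec].append(is_match)
    kept, removed = [], 0
    for vec, labels in groups.items():
        majority = sum(labels) * 2 >= len(labels)  # ties count as match
        for is_match in labels:
            if is_match == majority:
                kept.append((vec, is_match))
            else:
                removed += 1
    return kept, removed

# One vector occurs 16 times with a single conflicting label (pureness 15/16).
data = [((1.0, 1.0), True)] * 15 + [((1.0, 1.0), False)] + [((0.0, 0.1), False)]
kept, removed = remove_minority_copies(data)
print(len(kept), removed)  # 16 1
```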
Removed 1 non-pure weight vector

Final number of weight vectors to use: 879
  Number of unique weight vectors: 828

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (828, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 828 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 828 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 32 matches and 54 non-matches
    Purity of oracle classification:  0.628
    Entropy of oracle classification: 0.952
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 742 weight vectors
  Based on 32 matches and 54 non-matches
  Classified 168 matches and 574 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (168, 0.627906976744186, 0.9522656254366642, 0.37209302325581395)
    (574, 0.627906976744186, 0.9522656254366642, 0.37209302325581395)

Current size of match and non-match training data sets: 32 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 168 weight vectors
- Estimated match proportion 0.372

Sample size for this cluster: 59

Farthest first selection of 59 weight vectors from 168 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 59 weight vectors
  The oracle will correctly classify 59 weight vectors and wrongly classify 0
  Classified 47 matches and 12 non-matches
    Purity of oracle classification:  0.797
    Entropy of oracle classification: 0.729
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  12
    Number of false non-matches: 0

Deleted 59 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analyzing file: diverg(20)285_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 285), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)285_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 831
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 831 weight vectors
  Containing 227 true matches and 604 true non-matches
    (27.32% true matches)
  Identified 774 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   737  (95.22%)
          2 :    34  (4.39%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 774 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 583

Removed 1 non-pure weight vector

Final number of weight vectors to use: 830
  Number of unique weight vectors: 774

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (774, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 774 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 774 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 689 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 151 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (538, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 151 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 51 matches and 3 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.310
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0
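The purity and entropy values reported after each oracle step follow directly from the match/non-match counts: purity is the majority-class fraction of the sample, and entropy is the binary Shannon entropy (in bits) of the match/non-match split. A minimal sketch of that computation (the function name is mine, not from the script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = binary Shannon
    entropy (in bits) of the match/non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The oracle block above classified 51 matches and 3 non-matches
purity, entropy = purity_entropy(51, 3)
print(round(purity, 3), round(entropy, 3))  # → 0.944 0.31
```

The same formula reproduces the queue tuples later in the log, e.g. a 30/46 sample gives entropy 0.9677884628…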

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)280_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (15, 1 - acm diverg, 280), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)280_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 397
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 397 weight vectors
  Containing 210 true matches and 187 true non-matches
    (52.90% true matches)
  Identified 362 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   345  (95.30%)
          2 :    14  (3.87%)
          3 :     2  (0.55%)
         18 :     1  (0.28%)

Identified 1 non-pure unique weight vector (from 362 unique weight vectors)
Pureness (fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 177
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 184

Removed 1 non-pure weight vector
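The removal step keeps, for every unique weight vector that occurs with both true-match statuses, only the copies carrying its majority label (here one minority copy of a vector that is 17/18 matches, pureness 0.944). A sketch of that filtering, assuming simple majority voting per unique vector (the script's tie handling is not visible in this log):

```python
from collections import Counter

def remove_minority(weight_vectors, labels):
    """For each unique weight vector that appears with both labels,
    drop the copies carrying the minority label."""
    counts = {}
    for v, lab in zip(weight_vectors, labels):
        counts.setdefault(tuple(v), Counter())[lab] += 1
    kept_v, kept_l = [], []
    for v, lab in zip(weight_vectors, labels):
        c = counts[tuple(v)]
        # keep pure vectors, and majority-label copies of non-pure ones
        if len(c) == 1 or lab == max(c, key=c.get):
            kept_v.append(v)
            kept_l.append(lab)
    return kept_v, kept_l
```

Applied to the 397 vectors above this leaves 396, matching the "Final number of weight vectors to use" line below.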

Final number of weight vectors to use: 396
  Number of unique weight vectors: 362

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (362, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 362 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 362 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

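The "farthest first" selection greedily picks weight vectors that are maximally spread out: starting from an initial vector, it repeatedly adds the vector whose minimum Euclidean distance to the already-selected set is largest. A sketch of the traversal (the script's actual seeding strategy is not shown in this log; this version simply starts from the first vector):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum Euclidean distance to the selected set is largest."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy example with 1-dimensional "weight vectors"
pts = [[0.0], [1.0], [5.0], [9.0]]
print(farthest_first(pts, 3))  # → [[0.0], [9.0], [5.0]]
```

This is why the selected samples above mix extreme vectors (many 0.000/1.000 entries) rather than typical ones.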
Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 30 matches and 46 non-matches
    Purity of oracle classification:  0.605
    Entropy of oracle classification: 0.968
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  46
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 286 weight vectors
  Based on 30 matches and 46 non-matches
  Classified 140 matches and 146 non-matches
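The split step trains an SVM on the 30 + 46 oracle-labelled vectors and partitions the remaining 286 by predicted class, producing the two child clusters (140 and 146) seen in the next loop. A minimal sketch with scikit-learn; the kernel and parameters are assumptions, since the log does not show them:

```python
from sklearn.svm import SVC

def split_cluster(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining weight vectors by the predicted class (1=match, 0=non-match)."""
    clf = SVC()  # kernel and parameters are assumptions, not from the log
    clf.fit(labelled_vecs, labels)
    pred = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 0]
    return matches, non_matches

# Toy example: high-similarity vectors labelled match, low-similarity non-match
train = [[0.9, 0.95], [1.0, 0.9], [0.1, 0.2], [0.2, 0.1]]
y = [1, 1, 0, 0]
m, n = split_cluster(train, y, [[0.95, 0.85], [0.15, 0.15]])
```

Both child clusters initially inherit the parent sample's purity/entropy estimates, which is why the two queue entries in Loop 2 share identical statistics.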

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.6052631578947368, 0.9677884628267679, 0.39473684210526316)
    (146, 0.6052631578947368, 0.9677884628267679, 0.39473684210526316)

Current size of match and non-match training data sets: 30 / 46

Selected cluster with (queue ordering: random):
- Purity 0.61 and entropy 0.97
- Size 140 weight vectors
- Estimated match proportion 0.395

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 140 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 53 matches and 3 non-matches
    Purity of oracle classification:  0.946
    Entropy of oracle classification: 0.301
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing file: diverg(15)938_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 938), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)938_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1051
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1051 weight vectors
  Containing 223 true matches and 828 true non-matches
    (21.22% true matches)
  Identified 997 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   960  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 997 unique weight vectors)
Pureness (fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 807

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1050
  Number of unique weight vectors: 997

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (997, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 997 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 997 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 910 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 792 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (792, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 792 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 792 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)376_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.982143
recall                 0.183946
f-measure              0.309859
da                           56
dm                            0
ndm                           0
tp                           55
fp                            1
tn                  4.76529e+07
fn                          244
Name: (10, 1 - acm diverg, 376), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)376_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 446
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 446 weight vectors
  Containing 205 true matches and 241 true non-matches
    (45.96% true matches)
  Identified 413 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   399  (96.61%)
          2 :    11  (2.66%)
          3 :     2  (0.48%)
         19 :     1  (0.24%)

Identified 1 non-pure unique weight vector (from 413 unique weight vectors)
Pureness (fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 172
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 240

Removed 1 non-pure weight vector

Final number of weight vectors to use: 445
  Number of unique weight vectors: 413

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (413, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 413 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 413 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
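
The farthest-first listings above follow a greedy max-min pattern; a minimal sketch, assuming Euclidean distance over the weight vectors (the actual script may differ in tie-breaking and in how the start vector is chosen):

```python
import numpy as np

def farthest_first(vectors, k):
    """Greedy max-min selection: start from the first vector, then
    repeatedly pick the vector whose minimum distance to the
    already-selected set is largest."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [0]                     # deterministic start for this sketch
    dists = np.linalg.norm(vectors - vectors[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))    # farthest from the selected set
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected

# Two near-duplicates and one distant vector: the second pick jumps away
print(farthest_first([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0]], 2))   # [0, 2]
```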

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 34 matches and 44 non-matches
    Purity of oracle classification:  0.564
    Entropy of oracle classification: 0.988
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0
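
The purity and entropy figures reported with each oracle call are the majority-class fraction and the binary Shannon entropy of the match proportion; a minimal sketch that reproduces the numbers above:

```python
from math import log2

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy of the match proportion (0 for a pure cluster)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1 - p)
    entropy = 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)
    return purity, entropy

# Matches the log above: 34 matches, 44 non-matches
purity, entropy = purity_entropy(34, 44)
print(round(purity, 3), round(entropy, 3))   # 0.564 0.988
```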

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 335 weight vectors
  Based on 34 matches and 44 non-matches
  Classified 133 matches and 202 non-matches
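
The SVM split step can be sketched as follows, assuming scikit-learn's `SVC` with a linear kernel (the original code may use a different SVM implementation or kernel); `svm_split` is an illustrative name, not the script's API:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train on the oracle-labelled sample, then split the rest of
    the cluster into predicted matches and non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches
```

Each sub-cluster is then pushed back onto the queue, as the Loop 2 header below shows.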

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.5641025641025641, 0.9881108365218301, 0.4358974358974359)
    (202, 0.5641025641025641, 0.9881108365218301, 0.4358974358974359)

Current size of match and non-match training data sets: 34 / 44

Selected cluster (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 202 weight vectors
- Estimated match proportion 0.436

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 202 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.929, 1.000, 0.182, 0.238, 0.188, 0.146, 0.270] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 7 matches and 57 non-matches
    Purity of oracle classification:  0.891
    Entropy of oracle classification: 0.498
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

56.0
Analysing file: diverg(15)762_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 762), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)762_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 722
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 722 weight vectors
  Containing 217 true matches and 505 true non-matches
    (30.06% true matches)
  Identified 667 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   631  (94.60%)
          2 :    33  (4.95%)
          3 :     2  (0.30%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 667 unique weight vectors)
Pureness (as a proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 484

Removed 1 non-pure weight vector
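
The unique-vector and pureness statistics above amount to grouping identical weight vectors and counting their true-match labels; a minimal sketch (`pureness_stats` is an illustrative helper, not the script's API):

```python
from collections import Counter, defaultdict

def pureness_stats(weight_vectors, labels):
    """Group identical weight vectors; report each unique vector's
    pureness (fraction of its occurrences that are true matches) and
    the frequency distribution of how often unique vectors occur."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)
    pureness = {v: sum(labs) / len(labs) for v, labs in groups.items()}
    freq = Counter(len(labs) for labs in groups.values())
    return pureness, freq

vecs = [(1.0, 0.9), (1.0, 0.9), (0.1, 0.2)]
labs = [1, 0, 0]                     # the duplicated vector is 50% matches
pureness, freq = pureness_stats(vecs, labs)
print(pureness[(1.0, 0.9)], dict(freq))   # 0.5 {2: 1, 1: 1}
```

A unique vector with pureness strictly between 0 and 1 is "non-pure"; the log removes its minority-class occurrences before training.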

Final number of weight vectors to use: 721
  Number of unique weight vectors: 667

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (667, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 667 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 667 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 23 matches and 61 non-matches
    Purity of oracle classification:  0.726
    Entropy of oracle classification: 0.847
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 583 weight vectors
  Based on 23 matches and 61 non-matches
  Classified 0 matches and 583 non-matches

40.0
Analysing file: diverg(10)612_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 612), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)612_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 656
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 656 weight vectors
  Containing 215 true matches and 441 true non-matches
    (32.77% true matches)
  Identified 623 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   607  (97.43%)
          2 :    13  (2.09%)
          3 :     2  (0.32%)
         17 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 623 unique weight vectors)
Pureness (as a proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 440

Removed 1 non-pure weight vector

Final number of weight vectors to use: 655
  Number of unique weight vectors: 623

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (623, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 623 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 623 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.364, 0.619, 0.471, 0.600, 0.533] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 32 matches and 51 non-matches
    Purity of oracle classification:  0.614
    Entropy of oracle classification: 0.962
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 540 weight vectors
  Based on 32 matches and 51 non-matches
  Classified 146 matches and 394 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6144578313253012, 0.9618624139909456, 0.3855421686746988)
    (394, 0.6144578313253012, 0.9618624139909456, 0.3855421686746988)

Current size of match and non-match training data sets: 32 / 51

Selected cluster (queue ordering: random):
- Purity 0.61 and entropy 0.96
- Size 146 weight vectors
- Estimated match proportion 0.386

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.938, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 51 matches and 5 non-matches
    Purity of oracle classification:  0.911
    Entropy of oracle classification: 0.434
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)511_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 511), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)511_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 617
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 617 weight vectors
  Containing 186 true matches and 431 true non-matches
    (30.15% true matches)
  Identified 577 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   543  (94.11%)
          2 :    31  (5.37%)
          3 :     2  (0.35%)
          6 :     1  (0.17%)

Identified 0 non-pure unique weight vectors (from 577 unique weight vectors)
Pureness (as a proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 166
     0.000 : 411

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 617
  Number of unique weight vectors: 577

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (577, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 577 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82
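The printed sample sizes (82 of 577 here, and 58 of 157, 84 of 693, 75 of 467, 84 of 688, 72 of 464 in the later loops) are consistent with Cochran's sample-size formula with finite-population correction, using z = 1.96, margin of error e = 0.1, and p set to the cluster's estimated match proportion. A sketch under that assumption (the actual formula used by the program is not shown in this log):

```python
def sample_size(cluster_size, est_match_prop, z=1.96, e=0.1):
    """Cochran's sample-size formula with finite-population
    correction (assumed, not confirmed, to be the formula behind
    the sample sizes printed in this log)."""
    n0 = z * z * est_match_prop * (1.0 - est_match_prop) / (e * e)
    return int(n0 / (1.0 + (n0 - 1.0) / cluster_size))
```

For example, `sample_size(577, 0.5)` gives 82 and `sample_size(157, 0.3902439024390244)` gives 58, matching the log.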

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 577 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
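The "far" selection above is a farthest-first traversal: starting from an arbitrary vector, it repeatedly picks the vector whose distance to its nearest already-selected vector is largest, spreading the sample across the weight-vector space. A minimal sketch with Euclidean distance and a deterministic start (the program's own implementation may choose the start differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly select the vector
    with the largest distance to its nearest already-selected vector."""
    selected = [vectors[0]]  # arbitrary deterministic start
    while len(selected) < k:
        best, best_dist = None, -1.0
        for v in vectors:
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected
```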

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 32 matches and 50 non-matches
    Purity of oracle classification:  0.610
    Entropy of oracle classification: 0.965
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0
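The oracle is a simulated human reviewer with a configurable accuracy (the oracle_acc parameter from the usage header). At 100% accuracy every label is reported correctly; below that, the natural reading is that each label is flipped independently with probability 1 - accuracy. A sketch under that assumption (hypothetical re-implementation, not the program's code):

```python
import random

def simulated_oracle(true_labels, accuracy, seed=42):
    """Report each true match status correctly with probability
    `accuracy`, otherwise flip it (assumed error model)."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

With `accuracy=1.0` no label is ever flipped, which matches the "wrongly classify 0" line above.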

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 495 weight vectors
  Based on 32 matches and 50 non-matches
  Classified 157 matches and 338 non-matches
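After the oracle-labelled sample is removed, the remaining 495 vectors are split into two child clusters by a classifier trained on the 32 matches and 50 non-matches (an SVM here, per the split_classifier option). As a dependency-free illustration of the same binary split, here is a nearest-centroid stand-in, explicitly not the program's actual SVM:

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(match_train, nonmatch_train, cluster):
    """Assign each remaining weight vector to the child cluster whose
    training centroid is closer (stand-in for the SVM split)."""
    cm = centroid(match_train)
    cn = centroid(nonmatch_train)
    matches, non_matches = [], []
    for v in cluster:
        if math.dist(v, cm) <= math.dist(v, cn):
            matches.append(v)
        else:
            non_matches.append(v)
    return matches, non_matches
```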

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (157, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)
    (338, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)
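The purity and entropy attached to each queued cluster follow the usual binary definitions: purity is the majority-class fraction of the oracle-labelled sample, and entropy is its binary Shannon entropy. With the 32 matches and 50 non-matches classified above, this reproduces the 0.6097… and 0.9649… values in the queue printout:

```python
import math

def purity(num_match, num_nonmatch):
    """Majority-class fraction of a labelled sample."""
    total = num_match + num_nonmatch
    return max(num_match, num_nonmatch) / total

def entropy(num_match, num_nonmatch):
    """Binary Shannon entropy (in bits) of the match/non-match split."""
    total = num_match + num_nonmatch
    h = 0.0
    for count in (num_match, num_nonmatch):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h
```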

Current size of match and non-match training data sets: 32 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.96
- Size 157 weight vectors
- Estimated match proportion 0.390

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 157 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 47 matches and 11 non-matches
    Purity of oracle classification:  0.810
    Entropy of oracle classification: 0.701
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(15)481_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 481), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)481_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 727
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 727 weight vectors
  Containing 209 true matches and 518 true non-matches
    (28.75% true matches)
  Identified 693 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   676  (97.55%)
          2 :    14  (2.02%)
          3 :     2  (0.29%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 693 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 177
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 515

Removed 1 non-pure weight vector

Final number of weight vectors to use: 726
  Number of unique weight vectors: 693

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (693, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 693 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 693 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 609 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 142 matches and 467 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (467, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 467 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 467 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 3 matches and 72 non-matches
    Purity of oracle classification:  0.960
    Entropy of oracle classification: 0.242
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)833_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 833), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)833_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 724
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 724 weight vectors
  Containing 219 true matches and 505 true non-matches
    (30.25% true matches)
  Identified 688 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   672  (97.67%)
          2 :    13  (1.89%)
          3 :     2  (0.29%)
         20 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 688 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 183
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 504

Removed 1 non-pure weight vector

Final number of weight vectors to use: 723
  Number of unique weight vectors: 688

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (688, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 688 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 688 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
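The purity and entropy values reported for an oracle classification follow directly from the match and non-match counts; a minimal sketch (function names are illustrative, not taken from the original script):

```python
import math

def purity(num_match, num_non_match):
    # Purity: fraction of the majority class among the classified vectors
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    # Binary Shannon entropy (in bits) of the match / non-match split
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# 28 matches and 56 non-matches, as classified by the oracle above
print(round(purity(28, 56), 3))   # 0.667
print(round(entropy(28, 56), 3))  # 0.918
```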

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 604 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 140 matches and 464 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (464, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 464 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 464 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.462, 0.609, 0.643, 0.706, 0.786] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.533, 0.667, 0.333, 0.714, 0.632] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
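Farthest-first selection, as used above, greedily picks vectors so that each new pick maximises the distance to its nearest already-selected vector (the classic farthest-first traversal). A minimal sketch, assuming Euclidean distance and an arbitrary starting vector:

```python
import math

def farthest_first(vectors, k):
    # Farthest-first traversal: start from an arbitrary vector, then
    # repeatedly add the vector whose distance to the closest selected
    # vector is largest.
    selected = [vectors[0]]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected

corners = farthest_first([(0.0, 0.0), (10.0, 0.0), (5.0, 0.0), (0.0, 10.0)], 3)
print(corners)  # [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
```

This tends to pick extreme "corner" vectors first, which is why the samples above mix clearly matching and clearly non-matching vectors.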

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 10 matches and 62 non-matches
    Purity of oracle classification:  0.861
    Entropy of oracle classification: 0.581
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)619_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (20, 1 - acm diverg, 619), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)619_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 728
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 728 weight vectors
  Containing 197 true matches and 531 true non-matches
    (27.06% true matches)
  Identified 704 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   687  (97.59%)
          2 :    14  (1.99%)
          3 :     2  (0.28%)
          7 :     1  (0.14%)
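The frequency distribution above can be computed with two nested counts: first how often each unique weight vector occurs, then how many unique vectors share each occurrence count. A minimal sketch (function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # Count occurrences of each unique weight vector, then count how many
    # unique vectors share each occurrence frequency.
    vector_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vector_counts.values())

# Example: one vector occurring three times, another occurring once
dist = occurrence_distribution([[1.0, 0.5], [1.0, 0.5], [1.0, 0.5], [0.2, 0.3]])
print(dist)  # Counter({3: 1, 1: 1})
```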

Identified 0 non-pure unique weight vectors (from 704 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.000 : 529

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 728
  Number of unique weight vectors: 704

Time to load and analyse the weight vector file: 0.04 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (704, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 704 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 704 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 620 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 131 matches and 489 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (489, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 131 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 49 matches and 4 non-matches
    Purity of oracle classification:  0.925
    Entropy of oracle classification: 0.386
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(15)200_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 200), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)200_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 607
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 607 weight vectors
  Containing 192 true matches and 415 true non-matches
    (31.63% true matches)
  Identified 571 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   554  (97.02%)
          2 :    14  (2.45%)
          3 :     2  (0.35%)
         19 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 571 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 158
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 412

Removed 1 non-pure weight vector
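The pureness of a unique weight vector is the fraction of its occurrences that are true matches; vectors that are neither fully pure matches (1.000) nor fully pure non-matches (0.000) have their minority-class occurrences removed. A minimal sketch (function name is illustrative):

```python
def pureness(match_labels):
    # Fraction of a unique weight vector's occurrences that are true matches
    return sum(match_labels) / len(match_labels)

# The vector above occurring 19 times, 18 of them as matches, has
# pureness 18/19; its single non-match occurrence would be removed.
print(round(pureness([1] * 18 + [0]), 3))  # 0.947
```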

Final number of weight vectors to use: 606
  Number of unique weight vectors: 571

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (571, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 571 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 571 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.500, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 30 matches and 52 non-matches
    Purity of oracle classification:  0.634
    Entropy of oracle classification: 0.947
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 489 weight vectors
  Based on 30 matches and 52 non-matches
  Classified 132 matches and 357 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (132, 0.6341463414634146, 0.9474351361840306, 0.36585365853658536)
    (357, 0.6341463414634146, 0.9474351361840306, 0.36585365853658536)

Current size of match and non-match training data sets: 30 / 52

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 132 weight vectors
- Estimated match proportion 0.366

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 132 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(10)27_NEW.csv
<class 'pandas.core.series.Series'>
Current line right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (10, 1 - acm diverg, 27), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)27_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 673
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 673 weight vectors
  Containing 181 true matches and 492 true non-matches
    (26.89% true matches)
  Identified 652 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   642  (98.47%)
          2 :     7  (1.07%)
          3 :     2  (0.31%)
         11 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 652 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 160
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 491

Removed 1 non-pure weight vector

Final number of weight vectors to use: 672
  Number of unique weight vectors: 652

Time to load and analyse the weight vector file: 0.01 sec
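The minority-class removal above (dropping the few copies of a unique weight vector whose label disagrees with that vector's majority label, so every surviving unique vector is pure) can be sketched as follows. Function and variable names are illustrative; the original script's implementation is not shown here.

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    """For each unique weight vector, compute its pureness (fraction of
    copies that are true matches) and drop the minority-class copies,
    so every surviving unique vector is 100% pure."""
    groups = defaultdict(list)
    for i, vec in enumerate(weight_vectors):
        groups[tuple(vec)].append(i)
    keep = []
    for idxs in groups.values():
        matches = [i for i in idxs if labels[i]]
        non_matches = [i for i in idxs if not labels[i]]
        # Keep only the majority class of this unique vector
        keep.extend(matches if len(matches) >= len(non_matches) else non_matches)
    return sorted(keep)

# Toy example: vector (0.9, 1.0) occurs 3 times, once as a non-match
vecs = [(0.9, 1.0), (0.9, 1.0), (0.9, 1.0), (0.2, 0.1)]
labs = [True, True, False, False]
print(remove_minority_copies(vecs, labs))  # [0, 1, 3] -> index 2 removed
```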

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (652, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 652 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 652 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
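The farthest-first selection used above can be sketched with a greedy traversal: seed with one vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. This is a minimal Euclidean-distance version; the original script's distance metric and seeding strategy are assumptions.

```python
import numpy as np

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector,
    then repeatedly add the vector whose distance to the nearest
    already-selected vector is largest (maximising spread)."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [0]
    # Distance from every vector to its closest selected vector so far
    dists = np.linalg.norm(vectors - vectors[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dists))
        selected.append(nxt)
        dists = np.minimum(dists,
                           np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected

# Toy example: four 1-D points; picking 3 grabs the extremes first
pts = [[0.0], [0.1], [0.2], [1.0]]
print(farthest_first(pts, 3))  # [0, 3, 2]
```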

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 32 matches and 51 non-matches
    Purity of oracle classification:  0.614
    Entropy of oracle classification: 0.962
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0
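The purity and entropy values reported after each oracle round follow the standard two-class definitions: purity is the majority-class fraction, entropy the binary Shannon entropy of the match/non-match split. A minimal sketch (function names are illustrative) that reproduces the figures above:

```python
import math

def purity(num_matches, num_non_matches):
    """Fraction of the cluster belonging to its majority class."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    """Binary Shannon entropy (in bits) of the match/non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# The oracle above classified 32 matches and 51 non-matches:
print(round(purity(32, 51), 3))   # 0.614
print(round(entropy(32, 51), 3))  # 0.962
```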

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 569 weight vectors
  Based on 32 matches and 51 non-matches
  Classified 132 matches and 437 non-matches
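The SVM step trains on the sample just labelled by the oracle and splits the remaining cluster by predicted class, producing the two child clusters that enter the queue. A minimal sketch with scikit-learn on synthetic stand-in data; the original script's kernel, parameters, and feature values are assumptions.

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(42)

# Hypothetical stand-ins for the 32 oracle-labelled matches and
# 51 non-matches (7 similarity weights each, as in the log above)
train_X = np.vstack([rng.uniform(0.7, 1.0, (32, 7)),
                     rng.uniform(0.0, 0.5, (51, 7))])
train_y = np.array([1] * 32 + [0] * 51)

clf = svm.SVC(kernel="linear")
clf.fit(train_X, train_y)

# Classify the 569 unlabelled remaining vectors and split the
# cluster into a predicted-match and a predicted-non-match part
rest = rng.uniform(0.0, 1.0, (569, 7))
pred = clf.predict(rest)
match_cluster = rest[pred == 1]
non_match_cluster = rest[pred == 0]
print(len(match_cluster), len(non_match_cluster))
```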

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (132, 0.6144578313253012, 0.9618624139909456, 0.3855421686746988)
    (437, 0.6144578313253012, 0.9618624139909456, 0.3855421686746988)

Current size of match and non-match training data sets: 32 / 51

Selected cluster with (queue ordering: random):
- Purity 0.61 and entropy 0.96
- Size 132 weight vectors
- Estimated match proportion 0.386

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 132 vectors
  The selected farthest weight vectors are:
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 46 matches and 8 non-matches
    Purity of oracle classification:  0.852
    Entropy of oracle classification: 0.605
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analyzing file: diverg(20)993_NEW.csv
<class 'pandas.core.series.Series'>
Current line right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 993), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)993_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1100
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1100 weight vectors
  Containing 227 true matches and 873 true non-matches
    (20.64% true matches)
  Identified 1043 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1006  (96.45%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1043 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1099
  Number of unique weight vectors: 1043

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1043, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1043 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1043 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 955 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 846 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 846 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 846 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)993_NEW.csv
<class 'pandas.core.series.Series'>
Current line right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984375
recall                 0.210702
f-measure              0.347107
da                           64
dm                            0
ndm                           0
tp                           63
fp                            1
tn                  4.76529e+07
fn                          236
Name: (15, 1 - acm diverg, 993), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)993_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1012
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1012 weight vectors
  Containing 202 true matches and 810 true non-matches
    (19.96% true matches)
  Identified 962 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   928  (96.47%)
          2 :    31  (3.22%)
          3 :     2  (0.21%)
         16 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 962 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 172
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 789

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1011
  Number of unique weight vectors: 962

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (962, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 962 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 962 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
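The purity and entropy figures reported for each oracle call follow the standard binary definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch of the computation (the function name is illustrative, not taken from the script):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# The 28 matches / 59 non-matches classified above:
purity, entropy = cluster_stats(28, 59)
print(round(purity, 3), round(entropy, 3))  # 0.678 0.906
```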

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 875 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 138 matches and 737 non-matches
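The oracle-labelled sample then trains a classifier that splits the remaining weight vectors of the cluster into a predicted-match and a predicted-non-match sub-cluster, which become the two queue entries of the next loop. A sketch of that split using scikit-learn's `svm.SVC`, with random stand-in data (the original script's exact SVM settings are not shown in this output):

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(0)

# Stand-ins: 87 oracle-labelled vectors (28 matches, 59 non-matches)
# and the 875 still-unlabelled weight vectors of the cluster.
train_X = rng.random((87, 7))
train_y = np.array([1] * 28 + [0] * 59)
remaining = rng.random((875, 7))

clf = svm.SVC(kernel="linear")
clf.fit(train_X, train_y)

pred = clf.predict(remaining)
match_cluster = remaining[pred == 1]
non_match_cluster = remaining[pred == 0]
print(len(match_cluster), len(non_match_cluster))  # the two new queue entries
```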

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (737, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 138 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 138 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
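Farthest-first selection, used above to draw the oracle samples, greedily adds at each step the vector whose minimum distance to the already-selected set is largest, which spreads the sample across the weight space. A minimal sketch with Euclidean distance (the seeding rule here, starting from the first vector, is an assumption; the script may seed differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric tuples."""
    selected = [vectors[0]]  # assumed seed: the first vector
    # Minimum distance from every candidate to the selected set so far.
    dists = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=dists.__getitem__)
        selected.append(vectors[i])
        dists = [min(d, math.dist(v, vectors[i])) for d, v in zip(dists, vectors)]
    return selected

pts = [(0.0, 0.0), (1.0, 0.0), (0.1, 0.1), (1.0, 1.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```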

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 47 matches and 5 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.457
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

64.0
Analysing file: diverg(20)643_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 643), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)643_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority-class weight vectors with this pureness are removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036
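The pureness step above groups identical weight vectors, computes the match proportion of each group, and removes the minority-class copies of any group containing both matches and non-matches (here, one non-match among eleven matches gives the reported pureness 0.917). A sketch of that filter (the names and the tie-handling rule are assumptions, not taken from the script):

```python
from collections import defaultdict

def remove_non_pure(labelled_vectors):
    """Group identical vectors, compute each group's pureness
    (match proportion), and drop minority-class copies of
    non-pure groups."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        majority = sum(labels) / len(labels) >= 0.5  # assumed tie rule
        kept += [(vec, m) for m in labels if m == majority]
    return kept

# One group of 11 matches + 1 non-match (pureness 11/12 = 0.917)
# plus a pure non-match group:
data = [((1.0, 0.9), True)] * 11 + [((1.0, 0.9), False), ((0.1, 0.2), False)]
print(len(remove_non_pure(data)))  # 12 -- the single minority copy is removed
```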

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(20)596_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 596), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)596_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 945
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 945 weight vectors
  Containing 219 true matches and 726 true non-matches
    (23.17% true matches)
  Identified 890 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   854  (95.96%)
          2 :    33  (3.71%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 890 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority-class weight vectors with this pureness are removed)
     0.000 : 705

Removed 1 non-pure weight vector

Final number of weight vectors to use: 944
  Number of unique weight vectors: 890

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (890, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 890 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 890 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 804 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 130 matches and 674 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (674, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 674 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 674 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and incorrectly classify 0
  Classified 11 matches and 58 non-matches
    Purity of oracle classification:  0.841
    Entropy of oracle classification: 0.633
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

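The purity and entropy figures reported for each oracle classification above can be reproduced from the match / non-match counts alone. The following is a minimal sketch (function names `purity` and `entropy` are illustrative, not taken from the original script): purity is the majority-class fraction, entropy the binary Shannon entropy of the split.

```python
# Sketch of the purity / entropy statistics printed after each oracle call.
# Assumption: purity = majority-class fraction, entropy = binary Shannon
# entropy; both match the values logged above.
import math

def purity(num_matches, num_non_matches):
    """Fraction of the majority class among the classified vectors."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    """Binary (Shannon) entropy of the match / non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Reproduces the oracle statistics logged above (11 matches, 58 non-matches):
print(round(purity(11, 58), 3))   # 0.841
print(round(entropy(11, 58), 3))  # 0.633
```

With 11 matches and 58 non-matches this gives purity 0.841 and entropy 0.633, agreeing with the logged oracle classification.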
40.0
Analysing the file: diverg(10)169_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.197324
f-measure              0.329609
da                           59
dm                            0
ndm                           0
tp                           59
fp                            0
tn                  4.76529e+07
fn                          240
Name: (10, 1 - acm diverg, 169), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)169_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 364
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 364 weight vectors
  Containing 193 true matches and 171 true non-matches
    (53.02% true matches)
  Identified 338 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   323  (95.56%)
          2 :    12  (3.55%)
          3 :     2  (0.59%)
         11 :     1  (0.30%)

Identified 1 non-pure unique weight vector (from 338 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 169
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 168

Removed 1 non-pure weight vector

Final number of weight vectors to use: 363
  Number of unique weight vectors: 338

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (338, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 338 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 338 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and incorrectly classify 0
  Classified 31 matches and 44 non-matches
    Purity of oracle classification:  0.587
    Entropy of oracle classification: 0.978
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 263 weight vectors
  Based on 31 matches and 44 non-matches
  Classified 140 matches and 123 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.5866666666666667, 0.9782176659354248, 0.41333333333333333)
    (123, 0.5866666666666667, 0.9782176659354248, 0.41333333333333333)

Current size of match and non-match training data sets: 31 / 44

Selected cluster (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 140 weight vectors
- Estimated match proportion 0.413

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 140 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and incorrectly classify 0
  Classified 49 matches and 7 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

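The "Farthest first selection" steps logged above can be sketched as a greedy k-center selection: repeatedly pick the candidate vector whose distance to its nearest already-selected vector is largest. The distance metric and seeding rule below (Euclidean distance, first vector as seed) are assumptions for illustration; the original script may differ.

```python
# Hedged sketch of farthest-first (greedy k-center) selection, as suggested
# by the "Farthest first selection of k weight vectors" log lines.
# Assumptions: Euclidean distance, seeded with the first vector.
import math

def farthest_first(vectors, k):
    """Greedily pick k vectors, each maximising the distance to the
    closest already-selected vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed with the first vector
    while len(selected) < k:
        # for each remaining candidate, find its distance to the nearest
        # selected vector, then take the candidate maximising that distance
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(vecs, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```

The greedy choice spreads the labelled sample over the whole cluster, which is why the selected vectors printed above mix clear matches and clear non-matches rather than clustering around one corner of the weight space.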
59.0
Analysing the file: diverg(10)67_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984615
recall                 0.214047
f-measure              0.351648
da                           65
dm                            0
ndm                           0
tp                           64
fp                            1
tn                  4.76529e+07
fn                          235
Name: (10, 1 - acm diverg, 67), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)67_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 499
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 499 weight vectors
  Containing 179 true matches and 320 true non-matches
    (35.87% true matches)
  Identified 473 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   461  (97.46%)
          2 :     9  (1.90%)
          3 :     2  (0.42%)
         14 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 473 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 153
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 319

Removed 1 non-pure weight vector

Final number of weight vectors to use: 498
  Number of unique weight vectors: 473

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (473, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 473 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 473 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and incorrectly classify 0
  Classified 28 matches and 52 non-matches
    Purity of oracle classification:  0.650
    Entropy of oracle classification: 0.934
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 393 weight vectors
  Based on 28 matches and 52 non-matches
  Classified 130 matches and 263 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.65, 0.934068055375491, 0.35)
    (263, 0.65, 0.934068055375491, 0.35)

Current size of match and non-match training data sets: 28 / 52

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 130 weight vectors
- Estimated match proportion 0.350

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 130 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and incorrectly classify 0
  Classified 47 matches and 5 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.457
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

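The recurring log line "Cluster not pure enough or too large, and can be split further" implies a stopping test on each dequeued cluster. A minimal sketch of such a test is given below; the threshold names mirror the command-line parameters from the usage notes (min_purity, min_cluster_size, max_cluster_size), but the exact decision logic of the original script is an assumption.

```python
# Hedged sketch of the split/keep decision implied by the log line
# "Cluster not pure enough or too large, and can be split further".
# can_be_split() is an illustrative name, not from the original script.

def can_be_split(size, purity, min_purity, min_cluster_size, max_cluster_size):
    """Keep a cluster as-is when it is pure enough and small enough;
    otherwise split it further, provided both halves can stay viable."""
    pure_and_small = purity >= min_purity and size <= max_cluster_size
    if pure_and_small:
        return False
    # splitting must be able to leave two clusters of at least min size
    return size >= 2 * min_cluster_size

# e.g. the 130-vector cluster above (sample purity 0.904) with an assumed
# min_purity of 0.95:
print(can_be_split(130, 0.904, 0.95, 5, 100))  # True
```

Under these assumed thresholds the 130-vector cluster would be re-queued for further splitting, which is consistent with the run above ending only because the manual classification budget was exhausted.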
65.0
Analysing the file: diverg(15)207_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 207), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)207_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 817
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 817 weight vectors
  Containing 225 true matches and 592 true non-matches
    (27.54% true matches)
  Identified 760 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   723  (95.13%)
          2 :    34  (4.47%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 760 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 571

Removed 1 non-pure weight vector

Final number of weight vectors to use: 816
  Number of unique weight vectors: 760

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (760, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 760 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

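The "far" initial selection above is a farthest-first traversal: starting from a seed vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest, which spreads the sample across the weight-vector space. A minimal sketch under assumptions (Euclidean distance, first vector as seed; not the program's actual implementation):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    minimum Euclidean distance to the already-selected set is largest."""
    selected = [vectors[0]]                 # seed with an arbitrary vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        # update each vector's distance to its nearest selected vector
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected
```

Already-selected vectors keep a minimum distance of zero, so they are never picked twice (unless exact duplicates exist in the input).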
Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 25 matches and 60 non-matches
    Purity of oracle classification:  0.706
    Entropy of oracle classification: 0.874
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

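The oracle step simulates manual classification with a configurable accuracy (100% in this run, so no labels are flipped and the false match/non-match counts are zero). A sketch of such an imperfect oracle; the function name and seeding are illustrative assumptions, not the program's API:

```python
import random

def noisy_oracle(true_labels, accuracy, seed=42):
    """Return simulated oracle labels: each true label is kept with
    probability `accuracy` and flipped otherwise.
    accuracy=1.0 reproduces the ground truth exactly."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

With accuracy 1.0 every label is returned unchanged; with accuracy 0.0 every label is inverted.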
Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 675 weight vectors
  Based on 25 matches and 60 non-matches
  Classified 128 matches and 547 non-matches

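When a sampled cluster is still impure or too large, the remaining vectors are split by a classifier trained on all oracle-labelled examples collected so far (here 25 matches and 60 non-matches splitting 675 vectors into 128 and 547). A sketch using scikit-learn's `SVC`; the linear kernel and the helper name `split_cluster` are assumptions, not the program's actual configuration:

```python
from sklearn.svm import SVC

def split_cluster(train_match, train_non_match, cluster):
    """Train a binary SVM on the oracle-labelled vectors and split the
    remaining cluster into predicted-match / predicted-non-match parts."""
    X = train_match + train_non_match
    y = [1] * len(train_match) + [0] * len(train_non_match)
    clf = SVC(kernel="linear")
    clf.fit(X, y)
    pred = clf.predict(cluster)
    matches = [v for v, p in zip(cluster, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster, pred) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue for further sampling.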
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (128, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)
    (547, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)

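Each queue entry above is a tuple of (size, purity, entropy, estimated match proportion). Purity and entropy follow directly from the oracle counts: 25 matches out of 85 give an estimated match proportion of 25/85 ≈ 0.294, purity 60/85 ≈ 0.706, and binary entropy ≈ 0.874, matching the tuples shown. A small helper reproducing those values (the name is illustrative):

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity (majority-class fraction) and binary entropy of a cluster,
    given the oracle's match / non-match counts."""
    total = num_match + num_non_match
    p = num_match / total                       # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p
```

A perfectly mixed cluster (p = 0.5) has purity 0.5 and entropy 1.0, which is exactly the initial queue entry before any oracle labelling.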
Current size of match and non-match training data sets: 25 / 60

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 547 weight vectors
- Estimated match proportion 0.294

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 547 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.423, 0.478, 0.500, 0.813, 0.545] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 17 matches and 52 non-matches
    Purity of oracle classification:  0.754
    Entropy of oracle classification: 0.805
    Number of true matches:      17
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

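The run stops once the manual classification budget is exhausted. The overall recursive procedure (pop a cluster from the queue, sample and label it via the oracle, keep it as training data if pure and small enough, otherwise split and re-queue) can be sketched as below; every name and threshold, the prefix sampling, and the naive halving split are stand-in assumptions, not the program's actual logic:

```python
import random

def recursive_selection(vectors, oracle, budget, min_purity=0.95,
                        max_cluster_size=100, sample_frac=0.1):
    """Sketch of the recursive training-example selection loop.
    All names and thresholds here are illustrative assumptions."""
    queue = [list(vectors)]        # start with one cluster holding everything
    train_match, train_non_match = [], []
    used = 0                       # manual classifications performed so far
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random queue ordering
        n = max(1, int(sample_frac * len(cluster)))
        sample, rest = cluster[:n], cluster[n:]  # stand-in for farthest-first sampling
        labels = [oracle(v) for v in sample]     # "manual" oracle labelling
        used += len(sample)
        p = sum(labels) / len(labels)            # estimated match proportion
        if max(p, 1.0 - p) >= min_purity and len(rest) <= max_cluster_size:
            (train_match if p >= 0.5 else train_non_match).extend(rest)
        else:
            mid = len(rest) // 2                 # naive split; the log's run uses an SVM
            queue.extend(h for h in (rest[:mid], rest[mid:]) if h)
    return train_match, train_non_match
```

Terminating on the budget rather than on queue exhaustion is what produces the "Reached end of manual classification budget" lines in this log.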
39.0
Analysing file: diverg(15)934_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (15, 1 - acm diverg, 934), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)934_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 996
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 996 weight vectors
  Containing 170 true matches and 826 true non-matches
    (17.07% true matches)
  Identified 959 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   928  (96.77%)
          2 :    28  (2.92%)
          3 :     2  (0.21%)
          6 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 959 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 153
     0.000 : 806

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 996
  Number of unique weight vectors: 959

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (959, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 959 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 959 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 25 matches and 62 non-matches
    Purity of oracle classification:  0.713
    Entropy of oracle classification: 0.865
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 872 weight vectors
  Based on 25 matches and 62 non-matches
  Classified 42 matches and 830 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (42, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)
    (830, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 25 / 62

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 42 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 28

Farthest first selection of 28 weight vectors from 42 vectors
  The selected farthest weight vectors are:
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [0.971, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.833, 1.000, 1.000, 0.935] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)

Perform oracle with 100.00 accuracy on 28 weight vectors
  The oracle will correctly classify 28 weight vectors and wrongly classify 0
  Classified 28 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 28 weight vectors (classified by oracle) from cluster

Cluster is pure enough and not too large, add its 42 weight vectors to:
  Match training set

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 3: Queue length: 1
  Number of manual oracle classifications performed: 115
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (830, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 67 / 62

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 830 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 830 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 16 matches and 56 non-matches
    Purity of oracle classification:  0.778
    Entropy of oracle classification: 0.764
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing file: diverg(15)676_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 676), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)676_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 917
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 917 weight vectors
  Containing 199 true matches and 718 true non-matches
    (21.70% true matches)
  Identified 872 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   838  (96.10%)
          2 :    31  (3.56%)
          3 :     2  (0.23%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vectors (from 872 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 697

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 916
  Number of unique weight vectors: 872

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (872, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 872 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 872 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
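
The "far" initial selection above is a farthest-first traversal: starting from one vector, each step adds the vector whose minimum distance to the already-selected set is largest. A minimal sketch of the idea, assuming Euclidean distance and a random starting vector (the script's actual seeding and tie-breaking are not shown in this log):

```python
import math
import random

def farthest_first(vectors, k, seed=None):
    """Greedy farthest-first traversal: start from a random vector, then
    repeatedly add the vector whose minimum Euclidean distance to the
    already-selected set is largest."""
    rng = random.Random(seed)
    selected = [rng.choice(vectors)]
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(math.dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected
```

This tends to pick boundary and outlier vectors first, which is why the selected sample above mixes clear matches, clear non-matches, and ambiguous cases.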

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 25 matches and 61 non-matches
    Purity of oracle classification:  0.709
    Entropy of oracle classification: 0.870
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
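
The purity and entropy reported for each oracle-classified sample are the majority-class proportion and the binary Shannon entropy (in bits) of the match/non-match split. A small sketch reproducing the figures above:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = proportion of the majority class in the sample; entropy =
    binary Shannon entropy (in bits) of the match/non-match proportions."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 25/61 split above this gives purity 61/86 ≈ 0.709 and entropy ≈ 0.870, matching the values printed in the log.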

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 786 weight vectors
  Based on 25 matches and 61 non-matches
  Classified 125 matches and 661 non-matches
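
The SVM step fits a classifier to the oracle-labelled sample (25 matches, 61 non-matches) and uses its predictions to split the remaining 786 vectors into two child clusters. The log does not show which SVM implementation the script uses, so the sketch below substitutes a plain perceptron as a stand-in linear classifier, just to illustrate the split mechanics:

```python
def train_and_split(labeled, unlabeled, epochs=100, lr=0.1):
    """Fit a linear classifier on the oracle-labelled pairs (y = +1 match,
    -1 non-match), then split the unlabelled vectors into predicted-match
    and predicted-non-match child clusters. A perceptron stands in here
    for the SVM used by the script."""
    dim = len(labeled[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in labeled:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0.0:                      # misclassified: update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    match_child, non_match_child = [], []
    for x in unlabeled:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        (match_child if score > 0.0 else non_match_child).append(x)
    return match_child, non_match_child
```

Both child clusters inherit the parent's purity/entropy estimates until they are themselves sampled, which is why the two queue entries in Loop 2 below show identical statistics.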

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (125, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)
    (661, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)

Current size of match and non-match training data sets: 25 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.71 and entropy 0.87
- Size 661 weight vectors
- Estimated match proportion 0.291

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 661 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 13 matches and 58 non-matches
    Purity of oracle classification:  0.817
    Entropy of oracle classification: 0.687
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
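
The run above follows a budget-limited queue loop: pop a cluster at random, oracle-label a sample, and, if the cluster is still impure or too large, split it and queue the children. A much-simplified sketch, in which head-of-list sampling and a midpoint split stand in for the farthest-first and SVM steps:

```python
import random

def recursive_selection(vectors, budget, sample_size, oracle, seed=0):
    """Budget-limited driver loop: pop a cluster at random from the queue,
    have the oracle label a sample, then split the remainder and queue the
    halves. Sampling and splitting here are naive stand-ins for the
    farthest-first and SVM steps shown in the log."""
    rng = random.Random(seed)
    queue = [list(vectors)]
    training, used = [], 0
    while queue and used + sample_size <= budget:
        cluster = queue.pop(rng.randrange(len(queue)))
        if len(cluster) <= sample_size:
            continue                                  # too small to sample and split
        sample, rest = cluster[:sample_size], cluster[sample_size:]
        training += [(v, oracle(v)) for v in sample]  # manual classifications
        used += sample_size
        mid = len(rest) // 2
        queue += [rest[:mid], rest[mid:]]             # child clusters
    return training, used
```

The loop ends exactly as in the log: once another full sample would exceed the manual classification budget, selection stops even though impure clusters remain in the queue.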

58.0
Analysing file: diverg(20)934_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 934), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)934_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec
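
The analysis step above deduplicates the weight vectors, tabulates how often each unique vector occurs, and computes a per-vector pureness (the fraction of its occurrences that are true matches). A rough sketch of that bookkeeping:

```python
from collections import Counter

def analyse_weight_vectors(vectors, is_match):
    """Group duplicate weight vectors (given as hashable tuples), count the
    occurrences of each unique vector, and compute each vector's pureness:
    the fraction of its occurrences that are true matches."""
    occ, match_count = Counter(), Counter()
    for v, m in zip(vectors, is_match):
        occ[v] += 1
        if m:
            match_count[v] += 1
    freq_dist = Counter(occ.values())   # occurrence count -> number of unique vectors
    pureness = {v: match_count[v] / n for v, n in occ.items()}
    return freq_dist, pureness
```

A unique vector with pureness strictly between 0 and 1 is non-pure; the script removes its minority-class copies, as with the single 0.950 vector above.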

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analysing file: diverg(10)412_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 412), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)412_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 409
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 409 weight vectors
  Containing 200 true matches and 209 true non-matches
    (48.90% true matches)
  Identified 383 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   369  (96.34%)
          2 :    11  (2.87%)
          3 :     2  (0.52%)
         12 :     1  (0.26%)

Identified 1 non-pure unique weight vector (from 383 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 208

Removed 1 non-pure weight vector

Final number of weight vectors to use: 408
  Number of unique weight vectors: 383

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (383, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 383 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 383 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 40 matches and 37 non-matches
    Purity of oracle classification:  0.519
    Entropy of oracle classification: 0.999
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 306 weight vectors
  Based on 40 matches and 37 non-matches
  Classified 133 matches and 173 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.5194805194805194, 0.9989047442823606, 0.5194805194805194)
    (173, 0.5194805194805194, 0.9989047442823606, 0.5194805194805194)

Current size of match and non-match training data sets: 40 / 37

Selected cluster (queue ordering: random) with:
- Purity 0.52 and entropy 1.00
- Size 173 weight vectors
- Estimated match proportion 0.519

Sample size for this cluster: 62

Farthest first selection of 62 weight vectors from 173 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.717, 1.000, 0.240, 0.231, 0.065, 0.192, 0.184] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.800, 1.000, 0.259, 0.229, 0.214, 0.258, 0.156] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.850, 1.000, 0.179, 0.205, 0.188, 0.061, 0.180] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.663, 1.000, 0.273, 0.244, 0.226, 0.196, 0.238] (False)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.913, 1.000, 0.184, 0.175, 0.087, 0.233, 0.167] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.715, 1.000, 0.214, 0.125, 0.270, 0.214, 0.167] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.758, 1.000, 0.300, 0.140, 0.135, 0.125, 0.148] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.947, 1.000, 0.292, 0.178, 0.227, 0.122, 0.154] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 62 weight vectors
  The oracle will correctly classify 62 weight vectors and wrongly classify 0
  Classified 8 matches and 54 non-matches
    Purity of oracle classification:  0.871
    Entropy of oracle classification: 0.555
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 62 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(10)532_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 532), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)532_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 370
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 370 weight vectors
  Containing 191 true matches and 179 true non-matches
    (51.62% true matches)
  Identified 349 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   335  (95.99%)
          2 :    11  (3.15%)
          3 :     2  (0.57%)
          7 :     1  (0.29%)

Identified 0 non-pure unique weight vectors (from 349 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 170
     0.000 : 179

Removed 0 non-pure weight vectors
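The pureness analysis above groups identical weight vectors and reports, for each unique vector, the fraction of its occurrences that come from true matching pairs. A minimal sketch of that grouping (`pureness_counts` is an illustrative name, not the program's):

```python
from collections import defaultdict

def pureness_counts(weight_vectors, labels):
    """For each unique weight vector, return the fraction of its
    occurrences that are true matches (1.0 = all matches, 0.0 = all
    non-matches, anything in between = non-pure)."""
    matches = defaultdict(int)
    totals = defaultdict(int)
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)          # lists are unhashable, so key on a tuple
        totals[key] += 1
        if is_match:
            matches[key] += 1
    return {key: matches[key] / totals[key] for key in totals}

vectors = [[0.5, 1.0], [0.5, 1.0], [0.2, 0.0]]
labels = [True, True, False]
print(pureness_counts(vectors, labels))
# {(0.5, 1.0): 1.0, (0.2, 0.0): 0.0}
```

A vector with pureness strictly between 0 and 1 (such as the 0.933 case later in this log) is non-pure: identical weight vectors were generated by both matching and non-matching record pairs, and the minority-class copies are removed.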

Final number of weight vectors to use: 370
  Number of unique weight vectors: 349

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (349, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 349 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 349 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
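The "farthest first" selections logged above can be sketched as a greedy farthest-point traversal: repeatedly pick the vector whose distance to its nearest already-selected vector is largest. The seeding rule and distance metric used by the actual program are not shown in this log, so this is an assumed Euclidean variant seeded with the first vector:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric vectors.
    Starts from the first vector, then repeatedly adds the vector that
    maximises the distance to its nearest already-selected vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected

points = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.5, 0.5]]
print(farthest_first(points, 2))  # [[0.0, 0.0], [1.0, 1.0]]
```

This heuristic spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases rather than clustering in one region.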

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 39 matches and 36 non-matches
    Purity of oracle classification:  0.520
    Entropy of oracle classification: 0.999
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  36
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 274 weight vectors
  Based on 39 matches and 36 non-matches
  Classified 129 matches and 145 non-matches
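The split step above trains a classifier on the oracle-labelled vectors and uses it to divide the remaining cluster into a predicted-match and a predicted-non-match sub-cluster. The program uses an SVM; as a dependency-free stand-in, this sketch performs the same train-then-split flow with a nearest-centroid rule (`split_cluster` is a hypothetical helper, not the program's API):

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def split_cluster(train_matches, train_non_matches, remaining):
    """Split 'remaining' into (predicted matches, predicted non-matches).
    Stand-in for the SVM: assign each vector to the nearer class centroid."""
    m_c = centroid(train_matches)
    n_c = centroid(train_non_matches)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches, non_matches = [], []
    for v in remaining:
        (matches if sq_dist(v, m_c) <= sq_dist(v, n_c) else non_matches).append(v)
    return matches, non_matches

m, n = split_cluster([[0.9, 0.8], [0.8, 0.9]],   # oracle-labelled matches
                     [[0.1, 0.2], [0.2, 0.1]],   # oracle-labelled non-matches
                     [[0.85, 0.95], [0.15, 0.05]])
print(len(m), len(n))  # 1 1
```

Either way, the two predicted sub-clusters are pushed back onto the queue with the parent cluster's purity and entropy estimates, which is why both queue entries in Loop 2 share the same statistics.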

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (129, 0.52, 0.9988455359952018, 0.52)
    (145, 0.52, 0.9988455359952018, 0.52)

Current size of match and non-match training data sets: 39 / 36

Selected cluster with (queue ordering: random):
- Purity 0.52 and entropy 1.00
- Size 129 weight vectors
- Estimated match proportion 0.520

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 129 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 51 matches and 4 non-matches
    Purity of oracle classification:  0.927
    Entropy of oracle classification: 0.376
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(15)343_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 343), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)343_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 865
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 865 weight vectors
  Containing 203 true matches and 662 true non-matches
    (23.47% true matches)
  Identified 816 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   782  (95.83%)
          2 :    31  (3.80%)
          3 :     2  (0.25%)
         15 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 816 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 641

Removed 1 non-pure weight vector

Final number of weight vectors to use: 864
  Number of unique weight vectors: 816

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (816, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 816 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 816 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 730 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 172 matches and 558 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (172, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (558, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 172 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 172 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.942, 1.000, 0.156, 0.172, 0.189, 0.148, 0.133] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 44 matches and 13 non-matches
    Purity of oracle classification:  0.772
    Entropy of oracle classification: 0.775
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  13
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing the file: diverg(15)640_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 640), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)640_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 663
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 663 weight vectors
  Containing 194 true matches and 469 true non-matches
    (29.26% true matches)
  Identified 642 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   628  (97.82%)
          2 :    11  (1.71%)
          3 :     2  (0.31%)
          7 :     1  (0.16%)

Identified 0 non-pure unique weight vectors (from 642 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 469

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 663
  Number of unique weight vectors: 642

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (642, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 642 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 642 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

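The purity and entropy figures reported throughout this log follow directly from the match/non-match counts of a cluster: purity is the majority-class fraction, entropy the binary Shannon entropy of the match proportion. A minimal sketch (the function name is ours, not from the original program):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity and binary entropy of a cluster, as printed in the log above."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)         # majority-class fraction
    entropy = 0.0
    for q in (p, 1.0 - p):           # binary Shannon entropy in bits
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(28, 55)
# purity ≈ 0.663 and entropy ≈ 0.922, matching the oracle block above
```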
Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 559 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 127 matches and 432 non-matches

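After the oracle labels a sample, the remaining vectors of the cluster are classified by an SVM trained on those labels, splitting the cluster into a predicted-match and a predicted-non-match sub-cluster. A minimal sketch, using scikit-learn's `SVC` as a stand-in for whatever SVM library the original program uses, and synthetic vectors in place of the real similarity data:

```python
import numpy as np
from sklearn.svm import SVC  # assumption: scikit-learn stands in for the original SVM library

rng = np.random.default_rng(42)

# Synthetic stand-ins for the oracle-labelled sample (28 matches, 55 non-matches)
train_X = np.vstack([rng.uniform(0.6, 1.0, (28, 7)),    # match-like vectors
                     rng.uniform(0.0, 0.6, (55, 7))])   # non-match-like vectors
train_y = np.array([1] * 28 + [0] * 55)

# Remaining unlabelled vectors of the cluster
rest_X = rng.uniform(0.0, 1.0, (559, 7))

clf = SVC(kernel='linear').fit(train_X, train_y)
pred = clf.predict(rest_X)

# Split the cluster into the two sub-clusters that re-enter the queue
match_cluster = rest_X[pred == 1]
non_match_cluster = rest_X[pred == 0]
```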
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (127, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (432, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 127 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 127 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

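The farthest-first selections above pick weight vectors that are maximally spread out: starting from an arbitrary vector, each step adds the vector whose minimum Euclidean distance to the already-selected set is largest. A minimal sketch of this greedy traversal (the choice of starting vector and distance metric are assumptions; the original code may differ):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    minimum distance to the already-selected set is largest."""
    vectors = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(vectors)))]   # arbitrary starting vector
    # minimum distance from every vector to the selected set so far
    min_dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected

# e.g. pick 51 spread-out weight vectors from a cluster of 127
cluster = np.random.default_rng(1).uniform(0.0, 1.0, (127, 7))
idx = farthest_first(cluster, 51)
```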
Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 49 matches and 2 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.239
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(15)891_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979592
recall                  0.32107
f-measure              0.483627
da                           98
dm                            0
ndm                           0
tp                           96
fp                            2
tn                  4.76529e+07
fn                          203
Name: (15, 1 - acm diverg, 891), dtype: object

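The precision, recall and f-measure rows in the Series dump above follow from the tp/fp/fn counts it also reports. A minimal sketch of the arithmetic:

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

p, r, f = prf(tp=96, fp=2, fn=203)
# matches the dump above: precision 0.979592, recall 0.32107, f-measure 0.483627
```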
Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)891_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 169 true matches and 784 true non-matches
    (17.73% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   885  (96.62%)
          2 :    28  (3.06%)
          3 :     2  (0.22%)
          6 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 916 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 152
     0.000 : 764

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 953
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 120 matches and 709 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (120, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (709, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 709 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 709 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 3 matches and 72 non-matches
    Purity of oracle classification:  0.960
    Entropy of oracle classification: 0.242
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

98.0
Analysing file: diverg(10)588_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 588), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)588_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 836
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 836 weight vectors
  Containing 208 true matches and 628 true non-matches
    (24.88% true matches)
  Identified 789 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   754  (95.56%)
          2 :    32  (4.06%)
          3 :     2  (0.25%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 789 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 607

Removed 1 non-pure weight vector

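The pureness filtering reported above works per unique weight vector: its pureness is the fraction of its copies that are true matches, and for a non-pure vector only the minority-class copies are removed (here one of 12 copies, pureness 11/12 ≈ 0.917, hence 836 → 835 vectors). A minimal sketch of that step, with our own function and data names:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """weight_vectors: list of (vector_tuple, is_match) pairs.
    Keep only the majority-class copies of each unique vector."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [non-match copies, match copies]
    for vec, is_match in weight_vectors:
        counts[vec][int(is_match)] += 1
    kept = []
    for vec, is_match in weight_vectors:
        non_m, m = counts[vec]
        majority_is_match = m >= non_m
        if is_match == majority_is_match:
            kept.append((vec, is_match))
    return kept

# One vector occurring 12 times, 11 as matches and once as a non-match
# (pureness 11/12 ≈ 0.917): the single minority copy is removed.
data = [((1.0, 1.0), True)] * 11 + [((1.0, 1.0), False)] + [((0.2, 0.1), False)] * 3
# len(remove_non_pure(data)) == 14
```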
Final number of weight vectors to use: 835
  Number of unique weight vectors: 789

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (789, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 789 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 789 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 25 matches and 60 non-matches
    Purity of oracle classification:  0.706
    Entropy of oracle classification: 0.874
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 704 weight vectors
  Based on 25 matches and 60 non-matches
  Classified 123 matches and 581 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)
    (581, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)

Current size of match and non-match training data sets: 25 / 60

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 581 weight vectors
- Estimated match proportion 0.294

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 581 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 16 matches and 54 non-matches
    Purity of oracle classification:  0.771
    Entropy of oracle classification: 0.776
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
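The purity and entropy figures reported by the oracle steps in this log can be reproduced with a short sketch (the function names here are illustrative, not from the script itself):

```python
import math

def cluster_purity(num_matches, num_non_matches):
    # Purity: fraction of the majority class among the classified vectors
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    # Shannon entropy (base 2) of the match/non-match split:
    # 0.0 for a pure cluster, 1.0 for a 50/50 split
    total = num_matches + num_non_matches
    entropy = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy

# The 16 matches / 54 non-matches classified above
print(round(cluster_purity(16, 54), 3))   # 0.771
print(round(cluster_entropy(16, 54), 3))  # 0.776
```

Both values agree with the "Purity of oracle classification" and "Entropy of oracle classification" lines above.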

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(20)114_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 114), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)114_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)
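A frequency distribution like the one above (how many identical copies exist of each weight vector) can be computed along these lines (a sketch; `Counter` stands in for whatever the script uses internally):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # Count how often each exact weight vector occurs ...
    vector_counts = Counter(tuple(v) for v in weight_vectors)
    # ... then count how many unique vectors share each occurrence count
    return dict(sorted(Counter(vector_counts.values()).items()))

vecs = [[1.0, 0.5], [1.0, 0.5], [0.3, 0.3],
        [0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
print(occurrence_distribution(vecs))  # {1: 1, 2: 1, 3: 1}
```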

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808
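The removal of minority-class copies of a non-pure unique weight vector (such as the single vector above with pureness 0.941, i.e. 16 matches and 1 non-match among 17 copies) might be sketched as follows (function names and the majority rule are assumptions, not taken from the script):

```python
def pureness(match_labels):
    # Fraction of true matches among record pairs sharing this weight vector
    return sum(match_labels) / len(match_labels)

def drop_minority_copies(labels_by_vector):
    """For each unique weight vector, if it is non-pure (some copies are
    matches and some non-matches), keep only the majority-class copies."""
    kept = {}
    for vec, labels in labels_by_vector.items():
        p = pureness(labels)
        if 0.0 < p < 1.0:
            majority = 1 if p >= 0.5 else 0
            kept[vec] = [lab for lab in labels if lab == majority]
        else:
            kept[vec] = list(labels)
    return kept

# A vector seen 17 times: 16 matches, 1 non-match -> pureness 16/17 = 0.941
data = {(0.9, 0.8): [1] * 16 + [0]}
print(round(pureness(data[(0.9, 0.8)]), 3))          # 0.941
print(len(drop_minority_copies(data)[(0.9, 0.8)]))   # 16
```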

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
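The "farthest first" selection used above can be sketched as a greedy farthest-first traversal: repeatedly pick the weight vector whose distance to its nearest already-selected vector is largest, so the sample spreads across the cluster. This is a sketch assuming Euclidean distance; the actual script may differ in its seed choice and metric.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(vectors, k):
    # Seed with the first vector, then greedily maximise the minimum
    # distance to the selected set
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

points = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.5, 0.5]]
print(farthest_first(points, 2))  # [[0.0, 0.0], [1.0, 1.0]]
```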

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches
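The SVM split step — training on the oracle-labelled vectors and partitioning the remaining cluster into predicted matches and non-matches — could look like this (a sketch using scikit-learn's `SVC`; the original script may use a different SVM implementation and kernel):

```python
from sklearn.svm import SVC

# Oracle-labelled weight vectors and their match status (1 = match)
train_vectors = [[0.9, 1.0], [0.95, 0.9], [0.1, 0.0], [0.2, 0.1]]
train_labels = [1, 1, 0, 0]

clf = SVC(kernel="linear")
clf.fit(train_vectors, train_labels)

# The remaining, unclassified vectors of the cluster are split into two
# new clusters according to the SVM's predictions
remaining = [[0.85, 0.95], [0.15, 0.05]]
predicted = clf.predict(remaining)
match_cluster = [v for v, p in zip(remaining, predicted) if p == 1]
non_match_cluster = [v for v, p in zip(remaining, predicted) if p == 0]
print(len(match_cluster), len(non_match_cluster))  # 1 1
```

The two resulting clusters are then pushed back onto the queue, as the Loop 2 header below shows.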

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 118 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 118 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)38_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 38), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)38_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 657
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 657 weight vectors
  Containing 216 true matches and 441 true non-matches
    (32.88% true matches)
  Identified 624 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   608  (97.44%)
          2 :    13  (2.08%)
          3 :     2  (0.32%)
         17 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 624 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 183
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 440

Removed 1 non-pure weight vector

Final number of weight vectors to use: 656
  Number of unique weight vectors: 624

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (624, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 624 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 624 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.364, 0.619, 0.471, 0.600, 0.533] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 31 matches and 52 non-matches
    Purity of oracle classification:  0.627
    Entropy of oracle classification: 0.953
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 541 weight vectors
  Based on 31 matches and 52 non-matches
  Classified 151 matches and 390 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6265060240963856, 0.9533171305598173, 0.37349397590361444)
    (390, 0.6265060240963856, 0.9533171305598173, 0.37349397590361444)

Current size of match and non-match training data sets: 31 / 52

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 151 weight vectors
- Estimated match proportion 0.373

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.933, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 50 matches and 6 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.491
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(15)71_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 71), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)71_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 790
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 790 weight vectors
  Containing 208 true matches and 582 true non-matches
    (26.33% true matches)
  Identified 761 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   744  (97.77%)
          2 :    14  (1.84%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 761 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 789
  Number of unique weight vectors: 761

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (761, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 761 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 761 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
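The purity and entropy figures reported above follow the standard binary-cluster definitions: purity is the fraction of the sample belonging to the majority class, and entropy is the Shannon entropy of the match/non-match split. A minimal sketch (the function name is illustrative, not from the original program):

```python
import math

def purity_entropy(num_match, num_nonmatch):
    """Majority-class purity and binary Shannon entropy of a labelled sample."""
    total = num_match + num_nonmatch
    p = num_match / total                    # proportion of matches
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# The 85-vector sample above: 28 matches, 57 non-matches.
purity, entropy = purity_entropy(28, 57)
print(round(purity, 3), round(entropy, 3))
```

With 28 matches and 57 non-matches this reproduces the reported 0.671 purity and 0.914 entropy.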

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 676 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 133 matches and 543 non-matches
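The split step trains a classifier on the oracle-labelled sample and uses its predictions to divide the remaining unlabelled vectors into two candidate clusters. A sketch using scikit-learn's `SVC` (an assumption of how the `split_classifier` SVM option is realised; the original program's internals may differ):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on oracle-labelled vectors, then split the remaining
    vectors into a predicted-match and a predicted-non-match cluster."""
    clf = SVC(kernel="linear")
    clf.fit(labelled_vecs, labels)
    pred = clf.predict(unlabelled_vecs)
    return unlabelled_vecs[pred == 1], unlabelled_vecs[pred == 0]

# Tiny synthetic example: matches have high similarities, non-matches low.
rng = np.random.default_rng(42)
X = np.vstack([rng.uniform(0.7, 1.0, size=(10, 7)),    # match-like vectors
               rng.uniform(0.0, 0.3, size=(10, 7))])   # non-match-like vectors
y = np.array([1] * 10 + [0] * 10)
rest = np.vstack([rng.uniform(0.7, 1.0, size=(5, 7)),
                  rng.uniform(0.0, 0.3, size=(5, 7))])
m, n = svm_split(X, y, rest)
print(len(m), len(n))
```

The two returned arrays become the two new clusters pushed onto the queue.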

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 543 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 543 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
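Farthest-first selection, as used above, is a greedy k-center traversal: at each step it adds the vector whose minimum distance to the already-selected set is largest, spreading the sample across the cluster. A compact sketch (Euclidean distance and an arbitrary seed vector assumed; the original may use a different metric or seeding):

```python
import math

def farthest_first(vectors, k):
    """Greedy k-center sampling: repeatedly add the vector farthest
    (in minimum distance) from the already-selected set."""
    selected = [vectors[0]]                       # seed with an arbitrary vector
    min_d = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        # update each vector's distance to its nearest selected vector
        min_d = [min(d, math.dist(v, vectors[i]))
                 for d, v in zip(min_d, vectors)]
    return selected

sample = farthest_first([[0, 0], [0, 1], [1, 0], [10, 10], [9, 9]], 3)
print(sample)
```

On the toy input this first jumps to the far cluster at [10, 10], then to [9, 9], illustrating how the method prefers spread over density.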

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 12 matches and 61 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.645
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
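The `oracle_acc` parameter models an imperfect human labeller: each queried vector receives its true label with probability `oracle_acc` and the flipped label otherwise. A minimal simulation (illustrative names; at 100% accuracy, as in the runs above, nothing is flipped):

```python
import random

def oracle_classify(true_labels, oracle_acc, rng=random):
    """Simulate an imperfect oracle: each true label is returned intact
    with probability oracle_acc, flipped otherwise."""
    return [lab if rng.random() < oracle_acc else (not lab)
            for lab in true_labels]

# The 73-vector sample above: 12 true matches, 61 true non-matches.
truth = [True] * 12 + [False] * 61
answers = oracle_classify(truth, 1.0)   # 100% accuracy: no labels flipped
print(sum(answers), len(answers) - sum(answers))
```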

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)3_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 3), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)3_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 810
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 810 weight vectors
  Containing 223 true matches and 587 true non-matches
    (27.53% true matches)
  Identified 756 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   719  (95.11%)
          2 :    34  (4.50%)
          3 :     2  (0.26%)
         17 :     1  (0.13%)
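The occurrence distribution above is a histogram of a histogram: first count how often each distinct weight vector appears, then count how many vectors share each occurrence count. A sketch using `collections.Counter`:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that occur exactly that often."""
    vec_counts = Counter(map(tuple, weight_vectors))   # vector -> #occurrences
    return Counter(vec_counts.values())                # #occurrences -> #vectors

vecs = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3], [0.9, 0.9], [0.9, 0.9], [0.9, 0.9]]
print(sorted(occurrence_distribution(vecs).items()))
```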

Identified 1 non-pure unique weight vector (from 756 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 566

Removed 1 non-pure weight vector

Final number of weight vectors to use: 809
  Number of unique weight vectors: 756
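A unique weight vector is non-pure when identical vectors were generated by both matching and non-matching record pairs; its pureness is the proportion of matches among those pairs, and the minority-class copies are removed so each unique vector carries a single label. A sketch of that filtering step (names are illustrative, ties resolved to non-match):

```python
from collections import defaultdict

def remove_minority_class(pairs):
    """pairs: list of (weight_vector, is_match) tuples.  For each vector
    that occurs with both labels, drop the minority-class pairs."""
    stats = defaultdict(lambda: [0, 0])      # vector -> [#non-matches, #matches]
    for vec, is_match in pairs:
        stats[vec][int(is_match)] += 1
    kept = []
    for vec, is_match in pairs:
        non, mat = stats[vec]
        if non == 0 or mat == 0:
            kept.append((vec, is_match))     # pure vector: keep all copies
        elif is_match == (mat > non):
            kept.append((vec, is_match))     # majority-class copy only
    return kept

# A vector seen 16 times as a match and once as a non-match (pureness 0.941):
pairs = [(("a",), True)] * 16 + [(("a",), False)] + [(("b",), False)] * 3
kept = remove_minority_class(pairs)
print(len(pairs) - len(kept))
```

Here the single non-match copy of the mostly-matching vector is dropped, mirroring the "Removed 1 non-pure weight vectors" line in the log.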

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (756, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 756 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 756 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 671 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 94 matches and 577 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (577, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 577 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 577 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 20 matches and 53 non-matches
    Purity of oracle classification:  0.726
    Entropy of oracle classification: 0.847
    Number of true matches:      20
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)348_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (10, 1 - acm diverg, 348), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)348_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 634
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 634 weight vectors
  Containing 166 true matches and 468 true non-matches
    (26.18% true matches)
  Identified 618 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   608  (98.38%)
          2 :     7  (1.13%)
          3 :     2  (0.32%)
          6 :     1  (0.16%)

Identified 0 non-pure unique weight vectors (from 618 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 150
     0.000 : 468

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 634
  Number of unique weight vectors: 618

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (618, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 618 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 618 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 535 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 109 matches and 426 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (426, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 109 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
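
Farthest-first selection, as performed above, is the classic greedy k-center traversal: start from one vector and repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch (Euclidean distance and the random starting point are assumptions; the script's exact variant may differ):

```python
import math
import random

def farthest_first_selection(vectors, k, seed=None):
    """Greedily select k vectors, each time taking the one farthest
    (in minimum Euclidean distance) from the vectors chosen so far."""
    rng = random.Random(seed)
    selected = [rng.choice(vectors)]  # assumed: random starting vector
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        far = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[far])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[far]))
    return selected
```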

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 44 matches and 4 non-matches
    Purity of oracle classification:  0.917
    Entropy of oracle classification: 0.414
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing the file: diverg(10)123_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 123), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)123_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 606
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 606 weight vectors
  Containing 187 true matches and 419 true non-matches
    (30.86% true matches)
  Identified 566 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   532  (93.99%)
          2 :    31  (5.48%)
          3 :     2  (0.35%)
          6 :     1  (0.18%)

Identified 0 non-pure unique weight vectors (from 566 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.000 : 399

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 606
  Number of unique weight vectors: 566

Time to load and analyse the weight vector file: 0.01 sec
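
The analysis step above reduces to counting duplicate weight vectors and tabulating how often each unique vector occurs. A minimal sketch with `collections.Counter` (the function name is ours):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Return the number of unique weight vectors and a mapping
    occurrence-count -> number of vectors occurring that often."""
    counts = Counter(map(tuple, weight_vectors))  # vector -> occurrences
    freq = Counter(counts.values())               # occurrences -> #vectors
    return len(counts), dict(sorted(freq.items()))

# e.g. the 606 vectors analysed above (566 unique) would yield the
# logged distribution {1: 532, 2: 31, 3: 2, 6: 1}
```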

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (566, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 566 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 484 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 147 matches and 337 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (337, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 147 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 147 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 47 matches and 7 non-matches
    Purity of oracle classification:  0.870
    Entropy of oracle classification: 0.556
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing the file: diverg(20)436_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 436), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)436_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.05 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 103 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 103 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 43 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(15)430_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 430), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)430_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 788
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 788 weight vectors
  Containing 208 true matches and 580 true non-matches
    (26.40% true matches)
  Identified 759 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   742  (97.76%)
          2 :    14  (1.84%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 759 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 577

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 787
  Number of unique weight vectors: 759

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (759, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 759 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 759 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
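
The farthest-first selection above can be sketched as a greedy traversal: seed with one vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. This is a sketch only; the original program's seed choice and distance metric are not shown in the log:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors (Euclidean distance)."""
    selected = [vectors[0]]
    # Distance from each candidate to its nearest selected vector
    nearest = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=nearest.__getitem__)
        selected.append(vectors[idx])
        # Newly selected vector may now be the nearest one
        for i, v in enumerate(vectors):
            nearest[i] = min(nearest[i], math.dist(v, vectors[idx]))
    return selected
```

Each round costs one pass over the candidates, so selecting k of n vectors is O(kn) distance computations.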

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
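
The purity and entropy figures reported for each oracle-labelled sample follow directly from the match / non-match counts:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is the binary
    Shannon entropy (in bits) of the match / non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = sum(-q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the sample above, 30 matches and 55 non-matches give purity 55/85 = 0.647 and entropy 0.937, as reported.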

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 674 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 144 matches and 530 non-matches
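
The split above trains an SVM on the oracle-labelled sample and uses it to divide the remaining cluster into a predicted-match and a predicted-non-match cluster. The stand-in below is a minimal pure-Python linear SVM fitted by subgradient descent on the hinge loss; the original program presumably calls an SVM library, and its kernel and parameters are not shown in the log:

```python
def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM via hinge-loss subgradient descent; labels are +1 / -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:
                # Inside the margin (or misclassified): hinge update
                w = [wj + lr * (yi * xj - lam * wj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Correctly classified with margin: only the regulariser acts
                w = [wj * (1.0 - lr * lam) for wj in w]
    return w, b

def svm_split(train_X, train_y, cluster):
    """Train on the oracle-labelled sample, then split the remaining
    cluster into predicted matches and predicted non-matches."""
    w, b = train_linear_svm(train_X, train_y)
    matches, non_matches = [], []
    for v in cluster:
        score = sum(wj * xj for wj, xj in zip(w, v)) + b
        (matches if score >= 0.0 else non_matches).append(v)
    return matches, non_matches
```

Both sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.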

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (530, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 144 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 144 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 51 matches and 4 non-matches
    Purity of oracle classification:  0.927
    Entropy of oracle classification: 0.376
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(20)380_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 380), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)380_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)206_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (15, 1 - acm diverg, 206), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)206_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 875
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 875 weight vectors
  Containing 189 true matches and 686 true non-matches
    (21.60% true matches)
  Identified 835 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   801  (95.93%)
          2 :    31  (3.71%)
          3 :     2  (0.24%)
          6 :     1  (0.12%)

Identified 0 non-pure unique weight vectors (from 835 unique weight vectors)
Pureness (proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 169
     0.000 : 666

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 875
  Number of unique weight vectors: 835

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (835, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 835 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 835 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 30 matches and 56 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.933
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 749 weight vectors
  Based on 30 matches and 56 non-matches
  Classified 168 matches and 581 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (168, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)
    (581, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)

Current size of match and non-match training data sets: 30 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 581 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 581 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.538, 0.789, 0.353, 0.545, 0.550] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.857, 0.417, 0.750, 0.500, 0.455] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
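Farthest-first selection of the kind listed above is a greedy traversal: each step picks the weight vector whose minimum Euclidean distance to the already-selected set is largest. A minimal sketch (the seeding rule, here the first vector, is an assumption):

```python
import numpy as np

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector
    whose minimum distance to the selected set is largest."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [0]                                   # assumed seed
    min_dist = np.linalg.norm(vectors - vectors[0], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        d = np.linalg.norm(vectors - vectors[nxt], axis=1)
        min_dist = np.minimum(min_dist, d)           # nearest-selected distances
    return selected

pts = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [1.0, 0.0]])
print(farthest_first(pts, 3))  # [0, 1, 3]
```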

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 0 matches and 76 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing the file: diverg(10)523_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990385
recall                 0.344482
f-measure              0.511166
da                          104
dm                            0
ndm                           0
tp                          103
fp                            1
tn                  4.76529e+07
fn                          196
Name: (10, 1 - acm diverg, 523), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)523_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 540
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 540 weight vectors
  Containing 151 true matches and 389 true non-matches
    (27.96% true matches)
  Identified 525 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   518  (98.67%)
          2 :     4  (0.76%)
          3 :     2  (0.38%)
          8 :     1  (0.19%)
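An occurrence histogram like the one above can be built with two nested `collections.Counter` passes, first over the vectors themselves and then over their counts; a minimal sketch with hypothetical vectors:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that occur that often."""
    per_vector = Counter(map(tuple, weight_vectors))
    return Counter(per_vector.values())

# Hypothetical vectors: one occurring 8 times, one twice, one once
vecs = [(1.0, 0.5)] * 8 + [(0.2, 0.3)] * 2 + [(0.9, 0.1)]
print(occurrence_distribution(vecs))  # Counter({8: 1, 2: 1, 1: 1})
```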

Identified 1 non-pure unique weight vector (from 525 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 136
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 388

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 532
  Number of unique weight vectors: 524

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (524, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 524 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 524 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 27 matches and 54 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 443 weight vectors
  Based on 27 matches and 54 non-matches
  Classified 95 matches and 348 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (95, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (348, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 27 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 95 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 95 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 43 matches and 2 non-matches
    Purity of oracle classification:  0.956
    Entropy of oracle classification: 0.262
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

104.0
Analysing the file: diverg(20)657_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 657), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)657_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)522_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 522), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)522_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector
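
The non-pure filtering step can be sketched as follows. This is a reconstruction from the log messages, not the original code: the `pure_thresh` parameter is an assumption, inferred from the log removing only the minority class at pureness 0.950 but all nine occurrences at pureness 0.889, which suggests a threshold around 0.9.

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, pure_thresh=0.9):
    """weight_vectors: list of (weights_tuple, is_match) pairs.
    Pureness of a unique weight vector = fraction of its occurrences
    that are true matches.  pure_thresh is an assumed parameter:
    above it (or below 1 - pure_thresh) only the minority class is
    dropped; otherwise every occurrence of the vector is dropped."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)

    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        if pureness in (0.0, 1.0):                 # already pure, keep all
            kept += [(vec, m) for m in labels]
        elif pureness >= pure_thresh or pureness <= 1.0 - pure_thresh:
            majority = pureness > 0.5              # drop minority class only
            kept += [(vec, m) for m in labels if m == majority]
        # else: drop all occurrences of this ambiguous vector
    return kept
```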

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
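
Farthest-first selection, as listed above, is the classic greedy max-min traversal. A minimal sketch, assuming Euclidean distance and seeding from the first vector (the original script's seeding and distance metric may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedily pick k vectors: start from the first, then repeatedly
    add the vector whose minimum Euclidean distance to the vectors
    already selected is largest."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        farthest = max(remaining,
                       key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(farthest)
        selected.append(farthest)
    return selected
```

This tends to pick extreme, mutually distant weight vectors first, which is why the sampled vectors above mix clear matches and clear non-matches rather than clustering around one region.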

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 148 matches and 784 non-matches
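
The SVM split step trains on the oracle-labelled sample and partitions the unlabelled remainder of the cluster by predicted class. A sketch using scikit-learn's `SVC` as a stand-in (the kernel and parameters of the classifier used by the original script are not shown in the log):

```python
# Assumes scikit-learn is installed; a linear kernel is an assumption.
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining cluster into predicted matches and non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```

Each predicted subset becomes a child cluster pushed back on the queue, which is why the queue length grows to 2 in the next loop.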

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (784, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 784 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 784 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 8 matches and 66 non-matches
    Purity of oracle classification:  0.892
    Entropy of oracle classification: 0.494
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)309_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987952
recall                 0.274247
f-measure              0.429319
da                           83
dm                            0
ndm                           0
tp                           82
fp                            1
tn                  4.76529e+07
fn                          217
Name: (10, 1 - acm diverg, 309), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)309_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 249
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 249 weight vectors
  Containing 165 true matches and 84 true non-matches
    (66.27% true matches)
  Identified 232 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   224  (96.55%)
          2 :     5  (2.16%)
          3 :     2  (0.86%)
          9 :     1  (0.43%)

Identified 1 non-pure unique weight vector (out of 232 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 148
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 83

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 240
  Number of unique weight vectors: 231

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (231, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 231 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 68

Perform initial selection using "far" method

Farthest first selection of 68 weight vectors from 231 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 35 matches and 33 non-matches
    Purity of oracle classification:  0.515
    Entropy of oracle classification: 0.999
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  33
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 163 weight vectors
  Based on 35 matches and 33 non-matches
  Classified 113 matches and 50 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 68
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (113, 0.5147058823529411, 0.9993759069576514, 0.5147058823529411)
    (50, 0.5147058823529411, 0.9993759069576514, 0.5147058823529411)

Current size of match and non-match training data sets: 35 / 33

Selected cluster (queue ordering: random) with:
- Purity 0.51 and entropy 1.00
- Size 113 weight vectors
- Estimated match proportion 0.515

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 113 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.833, 1.000, 1.000, 0.935] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 47 matches and 5 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.457
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

83.0
Analysing file: diverg(20)94_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 94), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)94_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1091
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1091 weight vectors
  Containing 226 true matches and 865 true non-matches
    (20.71% true matches)
  Identified 1034 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   997  (96.42%)
          2 :    34  (3.29%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (out of 1034 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 844

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1090
  Number of unique weight vectors: 1034

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1034, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1034 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1034 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
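
The purity and entropy values printed after each oracle round follow directly from the match and non-match counts of the labelled sample. A minimal sketch in plain Python (the function name is illustrative, not taken from the program):

```python
import math

def cluster_purity_entropy(num_match, num_non_match):
    # Purity: fraction of the majority class in the labelled sample.
    # Entropy: two-class Shannon entropy (in bits) of the match proportion.
    n = num_match + num_non_match
    p = num_match / float(n)
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log(q, 2) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# First oracle round above: 25 matches and 63 non-matches
purity, entropy = cluster_purity_entropy(25, 63)
print('%.3f %.3f' % (purity, entropy))  # 0.716 0.861
```

The estimated match proportion reported for the resulting clusters in Loop 2 (0.284) is likewise just 25/88, the match fraction of the same oracle-labelled sample.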

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 946 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 815 non-matches
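
The split step trains a classifier on the oracle-labelled weight vectors and partitions the remaining, unlabelled vectors of the cluster into a predicted-match and a predicted-non-match sub-cluster. A minimal sketch, assuming scikit-learn's `svm.SVC` as a stand-in for the program's own SVM classifier (toy 2-d vectors; the real vectors carry one similarity weight per compared attribute):

```python
from sklearn import svm

# Oracle-labelled training vectors: 1 = match, 0 = non-match
train_vec = [[0.9, 0.8], [0.95, 0.9], [0.1, 0.2], [0.2, 0.1]]
train_lab = [1, 1, 0, 0]

clf = svm.SVC(kernel='linear')
clf.fit(train_vec, train_lab)

# Split the unlabelled remainder of the cluster into two sub-clusters
remainder = [[0.85, 0.9], [0.15, 0.1], [0.7, 0.75]]
pred = clf.predict(remainder)
matches = [v for v, p in zip(remainder, pred) if p == 1]
non_matches = [v for v, p in zip(remainder, pred) if p == 0]
```

Both sub-clusters are then pushed back onto the queue, each carrying the purity, entropy, and match-proportion estimates of the sample they were derived from, as the Loop 2 listing shows.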

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (815, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 131 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
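
The "farthest first" listings above come from a greedy farthest-first traversal over the cluster's weight vectors. A minimal sketch, assuming Euclidean distance and seeding from the first vector (the program may use a different metric or seed):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector, then
    # repeatedly add the candidate whose distance to its nearest
    # already-selected vector is largest.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    candidates = list(vectors[1:])
    while len(selected) < k and candidates:
        best = max(candidates, key=lambda v: min(dist(v, s) for s in selected))
        candidates.remove(best)
        selected.append(best)
    return selected

print(farthest_first([(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (0.0, 0.9)], 3))
# [(0.0, 0.0), (1.0, 1.0), (0.0, 0.9)]
```

Because the selection favours mutually distant vectors, the sample spreads towards the corners of the cluster's weight space rather than clumping around its centre.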

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)982_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990566
recall                 0.351171
f-measure              0.518519
da                          106
dm                            0
ndm                           0
tp                          105
fp                            1
tn                  4.76529e+07
fn                          194
Name: (15, 1 - acm diverg, 982), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)982_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 905
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 905 weight vectors
  Containing 154 true matches and 751 true non-matches
    (17.02% true matches)
  Identified 869 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   841  (96.78%)
          2 :    25  (2.88%)
          3 :     2  (0.23%)
          8 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 869 unique weight vectors)
Pureness (proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 138
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 730

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 897
  Number of unique weight vectors: 868

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (868, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 868 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 868 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 23 matches and 63 non-matches
    Purity of oracle classification:  0.733
    Entropy of oracle classification: 0.838
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 782 weight vectors
  Based on 23 matches and 63 non-matches
  Classified 68 matches and 714 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (68, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)
    (714, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)

Current size of match and non-match training data sets: 23 / 63

Selected cluster with (queue ordering: random):
- Purity 0.73 and entropy 0.84
- Size 68 weight vectors
- Estimated match proportion 0.267

Sample size for this cluster: 36

Farthest first selection of 36 weight vectors from 68 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)

Perform oracle with 100.00% accuracy on 36 weight vectors
  The oracle will correctly classify 36 weight vectors and wrongly classify 0
  Classified 36 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 36 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

106.0
Analysing file: diverg(10)454_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (10, 1 - acm diverg, 454), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)454_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 526
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 526 weight vectors
  Containing 224 true matches and 302 true non-matches
    (42.59% true matches)
  Identified 487 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   468  (96.10%)
          2 :    16  (3.29%)
          3 :     2  (0.41%)
         20 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 487 unique weight vectors)
Pureness (proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 299

Removed 1 non-pure weight vector

Final number of weight vectors to use: 525
  Number of unique weight vectors: 487

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (487, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 487 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 487 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 35 matches and 45 non-matches
    Purity of oracle classification:  0.562
    Entropy of oracle classification: 0.989
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 407 weight vectors
  Based on 35 matches and 45 non-matches
  Classified 172 matches and 235 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (172, 0.5625, 0.9886994082884974, 0.4375)
    (235, 0.5625, 0.9886994082884974, 0.4375)

Current size of match and non-match training data sets: 35 / 45

Selected cluster with (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 172 weight vectors
- Estimated match proportion 0.438

Sample size for this cluster: 61

Farthest first selection of 61 weight vectors from 172 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.821, 1.000, 0.275, 0.297, 0.227, 0.255, 0.152] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 61 weight vectors
  The oracle will correctly classify 61 weight vectors and wrongly classify 0
  Classified 45 matches and 16 non-matches
    Purity of oracle classification:  0.738
    Entropy of oracle classification: 0.830
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  16
    Number of false non-matches: 0

Deleted 61 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)988_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 988), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)988_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

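The farthest-first selection above can be sketched as a greedy traversal: start from a seed vector and repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest. This is a sketch under assumptions; the original script may seed differently or use another distance:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    minimum Euclidean distance to the already-selected set is largest."""
    selected = [vectors[0]]  # seed with the first vector (choice is arbitrary)
    # Track each candidate's distance to its nearest selected vector.
    min_d = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], math.dist(v, vectors[i]))
    return selected

# The most distant pair of a small set is picked first
sel = farthest_first([(0.0, 0.0), (1.0, 1.0), (10.0, 10.0), (0.1, 0.0)], 2)
```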
Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

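The purity and entropy figures reported for each oracle classification follow directly from the match/non-match counts. A minimal sketch, assuming purity is the majority-class fraction and entropy is the binary Shannon entropy (in bits) of the match proportion:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class.
    Entropy: binary Shannon entropy of the match proportion, in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy

# 23 matches and 65 non-matches give purity ~0.739 and entropy ~0.829,
# matching the values in the log above
purity, entropy = purity_entropy(23, 65)
```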
Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 remaining weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

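The SVM split of the remaining unlabelled vectors can be sketched with scikit-learn's `SVC`, trained on the oracle-labelled samples (an assumption for illustration: the original code may use a different SVM implementation, kernel, or parameters):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on oracle-labelled weight vectors, then partition the
    remaining vectors into predicted matches and non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(np.asarray(labelled_vecs), np.asarray(labels))
    pred = clf.predict(np.asarray(unlabelled_vecs))
    matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 0]
    return matches, non_matches

# Toy example: label 1 = match, 0 = non-match
m, n = svm_split([[0.0, 0.0], [1.0, 1.0]], [0, 1], [[0.9, 0.9], [0.1, 0.1]])
```

The two predicted subsets then become the two child clusters pushed back onto the queue.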
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)141_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 141), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)141_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 825
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 825 weight vectors
  Containing 219 true matches and 606 true non-matches
    (26.55% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   733  (95.32%)
          2 :    33  (4.29%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 585

Removed 1 non-pure weight vector

Final number of weight vectors to use: 824
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 31 matches and 54 non-matches
    Purity of oracle classification:  0.635
    Entropy of oracle classification: 0.947
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 31 matches and 54 non-matches
  Classified 325 matches and 359 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (325, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)
    (359, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)

Current size of match and non-match training data sets: 31 / 54

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.95
- Size 325 weight vectors
- Estimated match proportion 0.365

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 325 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 42 matches and 28 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing file: diverg(15)469_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (15, 1 - acm diverg, 469), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)469_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 929
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 929 weight vectors
  Containing 178 true matches and 751 true non-matches
    (19.16% true matches)
  Identified 890 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   860  (96.63%)
          2 :    27  (3.03%)
          3 :     2  (0.22%)
          9 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 890 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 159
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 730

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 920
  Number of unique weight vectors: 889

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (889, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 889 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 889 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
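
The farthest-first selection above is the classic greedy max-min traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster. A minimal sketch assuming Euclidean distance (function and variable names are illustrative, not taken from the original script):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: select k vectors that spread out."""
    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        # pick the remaining vector farthest from its nearest selected vector
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

sample = farthest_first([(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)], 3)
print(sample)  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

Note how the near-duplicate (0.1, 0.0) is skipped: max-min selection favours diverse vectors, which is why the sampled lists above mix clear matches and clear non-matches.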

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 23 matches and 63 non-matches
    Purity of oracle classification:  0.733
    Entropy of oracle classification: 0.838
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
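
The purity and entropy reported for an oracle classification follow the standard two-class cluster-quality definitions: purity is the fraction of the majority class, entropy is the binary Shannon entropy of the match proportion. A quick check against the counts above (the function name is illustrative):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total            # match proportion
    purity = max(p, 1.0 - p)           # fraction of the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# The oracle classified 23 matches and 63 non-matches:
purity, entropy = purity_and_entropy(23, 63)
print(round(purity, 3), round(entropy, 3))  # 0.733 0.838
```

The match proportion 23/86 = 0.267 is the same quantity logged later as the estimated match proportion of the resulting sub-clusters.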

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 803 weight vectors
  Based on 23 matches and 63 non-matches
  Classified 89 matches and 714 non-matches
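
The SVM step trains on the oracle-labelled vectors and splits the rest of the cluster into a predicted-match and a predicted-non-match sub-cluster, both of which go back onto the queue. The script itself uses an SVM; the sketch below substitutes a simple nearest-centroid rule so it stays self-contained, and all names are illustrative:

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(train_matches, train_non_matches, unlabelled):
    """Split remaining vectors into predicted matches / non-matches.

    Stand-in for the SVM step: each vector is assigned to the closer of
    the two class centroids built from the oracle-labelled training data.
    """
    cm = centroid(train_matches)
    cn = centroid(train_non_matches)
    pred_m, pred_n = [], []
    for v in unlabelled:
        (pred_m if math.dist(v, cm) < math.dist(v, cn) else pred_n).append(v)
    return pred_m, pred_n  # the two sub-clusters pushed onto the queue

pm, pn = split_cluster([(1.0, 1.0)], [(0.0, 0.0)],
                       [(0.9, 0.9), (0.1, 0.0)])
print(pm, pn)  # [(0.9, 0.9)] [(0.1, 0.0)]
```

With scikit-learn available, the same split would use `sklearn.svm.SVC` fitted on the 23 + 63 labelled vectors and `predict` on the remaining 803.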

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (89, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)
    (714, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)

Current size of match and non-match training data sets: 23 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.73 and entropy 0.84
- Size 714 weight vectors
- Estimated match proportion 0.267

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 714 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing file: diverg(15)921_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 921), dtype: object
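
The precision, recall, and f-measure in the summary row above follow directly from the confusion counts (tp = 40, fp = 0, fn = 259). A quick check (function name illustrative):

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Counts from the summary row above:
p, r, f = prf(40, 0, 259)
print(p, round(r, 6), round(f, 6))  # 1.0 0.133779 0.235988
```

With fp = 0 precision is trivially 1, so the low f-measure is driven entirely by recall: only 40 of the 299 true matches were found.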

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)921_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 712
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 712 weight vectors
  Containing 217 true matches and 495 true non-matches
    (30.48% true matches)
  Identified 657 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   621  (94.52%)
          2 :    33  (5.02%)
          3 :     2  (0.30%)
         19 :     1  (0.15%)
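
The frequency distribution above counts how often each distinct weight vector occurs: here 621 + 2·33 + 3·2 + 19·1 = 712 vectors collapse to 657 unique ones. That two-level count is a nested `collections.Counter` (the example data below is illustrative):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map occurrence count -> number of unique vectors occurring that often."""
    per_vector = Counter(map(tuple, vectors))  # vector -> how often it occurs
    return Counter(per_vector.values())        # occurrence -> vector count

# Illustrative data: one vector occurs twice, three occur once
dist = occurrence_distribution(
    [[0.1, 0.2], [0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]])
print(sorted(dist.items()))  # [(1, 3), (2, 1)]
```

Vectors are converted to tuples first because lists are not hashable and cannot be `Counter` keys.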

Identified 1 non-pure unique weight vector (from 657 unique weight vectors)
Pureness (percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 474

Removed 1 non-pure weight vector

Final number of weight vectors to use: 711
  Number of unique weight vectors: 657

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (657, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 657 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 657 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 573 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 164 matches and 409 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (164, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (409, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 409 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 409 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [1.000, 0.000, 0.700, 0.429, 0.476, 0.647, 0.810] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 2 matches and 69 non-matches
    Purity of oracle classification:  0.972
    Entropy of oracle classification: 0.185
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)955_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 955), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)955_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 907
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 907 weight vectors
  Containing 200 true matches and 707 true non-matches
    (22.05% true matches)
  Identified 862 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   828  (96.06%)
          2 :    31  (3.60%)
          3 :     2  (0.23%)
         11 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 862 unique weight vectors)
Pureness (percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 175
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 686

Removed 1 non-pure weight vector

Final number of weight vectors to use: 906
  Number of unique weight vectors: 862

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (862, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 862 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 862 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 776 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 154 matches and 622 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (154, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (622, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 154 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 154 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
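The "farthest first" selection listed above is the classic greedy traversal: start from one vector, then repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and a fixed starting index (the script's actual distance measure and starting rule are not shown in this log):

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: select k vectors, each time taking
    the vector whose minimum distance to the selected set is largest."""
    X = np.asarray(vectors, dtype=float)
    selected = [start]
    # Minimum distance from every vector to the selected set so far
    min_dist = np.linalg.norm(X - X[start], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected
```

Each newly selected vector only needs one distance pass to update `min_dist`, so the traversal costs O(n·k) distance computations.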

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 46 matches and 9 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.643
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0
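The purity and entropy figures reported for each oracle classification follow the standard binary definitions: purity is the majority-class fraction, and entropy is the base-2 entropy of the match proportion. A minimal sketch that reproduces the numbers above:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary entropy (in bits)
    of a cluster of labelled weight vectors."""
    n = num_matches + num_non_matches
    p = num_matches / n  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# cluster_stats(46, 9) -> purity ~0.836, entropy ~0.643 (as logged above)
```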

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
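The run above follows one outer loop: pop a cluster from the queue, let the oracle label a sample, grow the training sets, and split any cluster that is still impure or too large, until the manual classification budget is spent. A structural sketch, with `select_sample`, `oracle_classify` and `split_cluster` passed in as hypothetical stand-ins for the farthest-first, oracle and SVM steps shown in the log:

```python
import random

def recursive_train_selection(clusters, budget, min_purity, max_cluster_size,
                              select_sample, oracle_classify, split_cluster):
    """Sketch of the outer selection loop (helper functions are
    hypothetical stand-ins for the steps shown in the log above)."""
    train_m, train_n = [], []   # match / non-match training sets
    num_oracle = 0
    queue = list(clusters)
    while queue and num_oracle < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # queue ordering: random
        sample = select_sample(cluster)                    # e.g. farthest-first
        matches, non_matches = oracle_classify(sample)     # manual labelling
        num_oracle += len(sample)
        train_m += matches
        train_n += non_matches
        rest = [v for v in cluster if v not in sample]     # delete sampled vectors
        purity = max(len(matches), len(non_matches)) / len(sample)
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            queue.extend(split_cluster(rest, matches, non_matches))  # e.g. SVM split
    return train_m, train_n
```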

58.0
Analysing the file: diverg(10)667_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.977099
recall                 0.428094
f-measure              0.595349
da                          131
dm                            0
ndm                           0
tp                          128
fp                            3
tn                  4.76529e+07
fn                          171
Name: (10, 1 - acm diverg, 667), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)667_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 151
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 151 weight vectors
  Containing 116 true matches and 35 true non-matches
    (76.82% true matches)
  Identified 141 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   134  (95.04%)
          2 :     4  (2.84%)
          3 :     3  (2.13%)

Identified 0 non-pure unique weight vectors (from 141 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 106
     0.000 : 35

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 151
  Number of unique weight vectors: 141

Time to load and analyse the weight vector file: 0.00 sec
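The load-and-analyse step above groups identical weight vectors, builds the occurrence frequency distribution, and computes each unique vector's pureness (its proportion of true matches). A minimal sketch of that bookkeeping, assuming weight vectors arrive as sequences of floats with a boolean true-match label:

```python
from collections import Counter

def analyse_weight_vectors(vectors, labels):
    """Occurrence frequencies of identical weight vectors and the
    pureness (match proportion) of each unique weight vector."""
    counts = Counter(tuple(v) for v in vectors)
    match_counts = Counter(tuple(v) for v, is_match in zip(vectors, labels)
                           if is_match)
    freq_dist = Counter(counts.values())   # occurrence -> number of uniques
    pureness = {v: match_counts[v] / c for v, c in counts.items()}
    return freq_dist, pureness
```

Unique vectors with pureness strictly between 0 and 1 are the "non-pure" ones whose minority-class copies get removed.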

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 141 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 57

Perform initial selection using "far" method

Farthest first selection of 57 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 35 matches and 22 non-matches
    Purity of oracle classification:  0.614
    Entropy of oracle classification: 0.962
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  22
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 84 weight vectors
  Based on 35 matches and 22 non-matches
  Classified 84 matches and 0 non-matches

131.0
Analysing the file: diverg(15)742_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 742), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)742_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 752
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 752 weight vectors
  Containing 204 true matches and 548 true non-matches
    (27.13% true matches)
  Identified 723 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   706  (97.65%)
          2 :    14  (1.94%)
          3 :     2  (0.28%)
         12 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 723 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 177
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 545

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 751
  Number of unique weight vectors: 723

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (723, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 723 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 723 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.429, 0.786, 0.750, 0.389, 0.857] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 35 matches and 50 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.977
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 638 weight vectors
  Based on 35 matches and 50 non-matches
  Classified 308 matches and 330 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (308, 0.5882352941176471, 0.9774178175281716, 0.4117647058823529)
    (330, 0.5882352941176471, 0.9774178175281716, 0.4117647058823529)

Current size of match and non-match training data sets: 35 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.59 and entropy 0.98
- Size 308 weight vectors
- Estimated match proportion 0.412

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 308 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.890, 1.000, 0.281, 0.136, 0.183, 0.250, 0.163] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 45 matches and 26 non-matches
    Purity of oracle classification:  0.634
    Entropy of oracle classification: 0.948
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(20)919_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 919), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)919_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analysing file: diverg(20)543_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 543), dtype: object
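
The precision, recall, and f-measure fields in the Series above follow from the tp/fp/fn counts via the standard formulas; a short sketch (variable names are illustrative):

```python
# Counts taken from the Series printed above.
tp, fp, fn = 39, 0, 260

# Standard binary-classification metrics, guarded against division by zero.
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f_measure = (2 * precision * recall / (precision + recall)
             if precision + recall else 0.0)

print(round(precision, 6), round(recall, 6), round(f_measure, 6))
# 1.0 0.130435 0.230769
```

This reproduces the precision 1, recall 0.130435, and f-measure 0.230769 shown in the Series.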

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)543_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
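
The occurrence histogram above counts how often each identical weight vector appears in the file. Such a distribution can be built with two nested counts; a sketch over hypothetical vectors (not the actual data):

```python
from collections import Counter

# Hypothetical weight vectors; tuples so they are hashable.
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.9),
           (0.2, 0.9), (0.2, 0.9), (0.7, 0.1)]

occ = Counter(vectors)        # vector -> number of occurrences
dist = Counter(occ.values())  # occurrence count -> number of unique vectors

for count in sorted(dist):
    n = dist[count]
    print('%3d : %3d  (%.2f%%)' % (count, n, 100.0 * n / len(occ)))
```

The percentages are taken over the unique vectors, matching the layout of the histogram above.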

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
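
Farthest-first selection, as used above, greedily picks vectors that are maximally spread out: starting from one vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest. A minimal Euclidean sketch (the original script's distance measure and starting rule are unknown, so both are assumptions here):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal selecting k vectors."""
    def dist(a, b):
        # Euclidean distance between two equal-length vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]          # arbitrary starting vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest selected neighbour.
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Tiny example: from four 1-D points, pick the three most spread out.
print(farthest_first([(0.0,), (0.1,), (0.9,), (0.5,)], 3))
# [(0.0,), (0.9,), (0.5,)]
```

The greedy rule favours boundary cases, which is why the selected samples above mix clear matches and clear non-matches rather than clustering in one region.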

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
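
At this point the remaining unlabelled weight vectors are classified by an SVM trained on the oracle-labelled pairs. A sketch using scikit-learn; the original script's kernel and parameters are unknown, so a linear kernel and these tiny example vectors are assumptions:

```python
from sklearn import svm

# Hypothetical oracle-labelled weight vectors: 1 = match, 0 = non-match.
train_vectors = [[0.9, 0.8], [1.0, 0.9], [0.2, 0.1], [0.1, 0.3]]
train_labels = [1, 1, 0, 0]

clf = svm.SVC(kernel='linear')
clf.fit(train_vectors, train_labels)

# Classify the still-unlabelled vectors into matches and non-matches.
unlabelled = [[0.95, 0.85], [0.15, 0.2]]
predictions = clf.predict(unlabelled)
print(predictions.tolist())  # [1, 0]
```

The predicted matches and non-matches then form the two child clusters pushed back onto the queue, as the loop output below shows.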

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)819_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 819), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)819_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 292
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 292 weight vectors
  Containing 207 true matches and 85 true non-matches
    (70.89% true matches)
  Identified 259 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   245  (94.59%)
          2 :    11  (4.25%)
          3 :     2  (0.77%)
         19 :     1  (0.39%)

Identified 1 non-pure unique weight vector (from 259 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 84

Removed 1 non-pure weight vector

Final number of weight vectors to use: 291
  Number of unique weight vectors: 259

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (259, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 259 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 70

Perform initial selection using "far" method

Farthest first selection of 70 weight vectors from 259 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 38 matches and 32 non-matches
    Purity of oracle classification:  0.543
    Entropy of oracle classification: 0.995
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  32
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 189 weight vectors
  Based on 38 matches and 32 non-matches
  Classified 145 matches and 44 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 70
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.5428571428571428, 0.9946937953613058, 0.5428571428571428)
    (44, 0.5428571428571428, 0.9946937953613058, 0.5428571428571428)

Current size of match and non-match training data sets: 38 / 32

Selected cluster with (queue ordering: random):
- Purity 0.54 and entropy 0.99
- Size 145 weight vectors
- Estimated match proportion 0.543

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 49 matches and 9 non-matches
    Purity of oracle classification:  0.845
    Entropy of oracle classification: 0.623
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)569_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990196
recall                 0.337793
f-measure              0.503741
da                          102
dm                            0
ndm                           0
tp                          101
fp                            1
tn                  4.76529e+07
fn                          198
Name: (10, 1 - acm diverg, 569), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)569_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 462
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 462 weight vectors
  Containing 162 true matches and 300 true non-matches
    (35.06% true matches)
  Identified 441 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   430  (97.51%)
          2 :     8  (1.81%)
          3 :     2  (0.45%)
         10 :     1  (0.23%)

Identified 1 non-pure unique weight vector (from 441 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 143
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 297

Removed 1 non-pure weight vector

Final number of weight vectors to use: 461
  Number of unique weight vectors: 441

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (441, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 441 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 441 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 35 matches and 44 non-matches
    Purity of oracle classification:  0.557
    Entropy of oracle classification: 0.991
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0
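
The purity and entropy figures reported for each oracle-classified sample can be reproduced from the match / non-match counts: purity is the majority-class fraction and entropy is the binary Shannon entropy of the class distribution. A minimal sketch (the helper name `cluster_stats` is illustrative, not taken from the program):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity and entropy of a cluster from its match / non-match
    counts, as reported in the log above (sketch only; the exact
    code in recursive-train-selection.py may differ)."""
    total = num_matches + num_non_matches
    p_match = num_matches / total
    purity = max(p_match, 1.0 - p_match)  # fraction of the majority class
    entropy = 0.0
    for p in (p_match, 1.0 - p_match):
        if p > 0.0:
            entropy -= p * math.log(p, 2)  # binary Shannon entropy
    return purity, entropy

# Example from the log: 35 matches, 44 non-matches
purity, entropy = cluster_stats(35, 44)
print(round(purity, 3), round(entropy, 3))  # 0.557 0.991
```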

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 362 weight vectors
  Based on 35 matches and 44 non-matches
  Classified 122 matches and 240 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (122, 0.5569620253164557, 0.9906174973781801, 0.4430379746835443)
    (240, 0.5569620253164557, 0.9906174973781801, 0.4430379746835443)

Current size of match and non-match training data sets: 35 / 44

Selected cluster with (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 240 weight vectors
- Estimated match proportion 0.443

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 240 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.600, 0.944, 0.250, 0.200, 0.186, 0.136, 0.118] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.818, 0.727, 0.438, 0.375, 0.400] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [0.821, 1.000, 0.275, 0.297, 0.227, 0.255, 0.152] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
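
The farthest-first selection used above can be sketched as a greedy loop: starting from one vector, repeatedly pick the vector whose minimum Euclidean distance to the already-selected set is largest. A simplified sketch, seeded deterministically from the first vector (the program may seed differently):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors (sketch of the
    selection step reported above, not the program's exact code)."""
    selected = [vectors[0]]  # deterministic seed for this sketch
    while len(selected) < k:
        best, best_dist = None, -1.0
        for v in vectors:
            if v in selected:
                continue
            # Minimum Euclidean distance to the selected set
            d = min(sum((a - b) ** 2 for a, b in zip(v, s)) ** 0.5
                    for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected
```

The effect, visible in the listings above, is a sample that spreads across the weight-vector space rather than clustering in one region.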

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 2 matches and 66 non-matches
    Purity of oracle classification:  0.971
    Entropy of oracle classification: 0.191
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

102.0
Analysing file: diverg(20)507_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 507), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)507_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 701
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 701 weight vectors
  Containing 219 true matches and 482 true non-matches
    (31.24% true matches)
  Identified 646 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   610  (94.43%)
          2 :    33  (5.11%)
          3 :     2  (0.31%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 646 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 461
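
The pureness values tabulated above are, for each unique weight vector, the fraction of its occurrences that came from true matching record pairs; a vector occurring with both labels is non-pure. A small sketch (the helper name is illustrative):

```python
from collections import defaultdict

def pureness_per_unique_vector(weight_vectors, labels):
    """For each unique weight vector, the fraction of its occurrences
    that are true matches (the 'pureness' reported above)."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [matches, total]
    for v, is_match in zip(weight_vectors, labels):
        counts[tuple(v)][1] += 1
        if is_match:
            counts[tuple(v)][0] += 1
    return {v: m / t for v, (m, t) in counts.items()}
```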

Removed 1 non-pure weight vector

Final number of weight vectors to use: 700
  Number of unique weight vectors: 646

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (646, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 646 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 646 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 563 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 157 matches and 406 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (157, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (406, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 406 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 406 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 3 matches and 68 non-matches
    Purity of oracle classification:  0.958
    Entropy of oracle classification: 0.253
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)214_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990566
recall                 0.351171
f-measure              0.518519
da                          106
dm                            0
ndm                           0
tp                          105
fp                            1
tn                  4.76529e+07
fn                          194
Name: (10, 1 - acm diverg, 214), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)214_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 880
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 880 weight vectors
  Containing 154 true matches and 726 true non-matches
    (17.50% true matches)
  Identified 844 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   816  (96.68%)
          2 :    25  (2.96%)
          3 :     2  (0.24%)
          8 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 844 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 138
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 705

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 872
  Number of unique weight vectors: 843

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (843, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 843 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 843 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 757 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 73 matches and 684 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (73, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (684, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 684 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 684 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 14 matches and 55 non-matches
    Purity of oracle classification:  0.797
    Entropy of oracle classification: 0.728
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
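
The purity and entropy that each oracle report prints follow directly from the match/non-match counts: purity is the majority-class fraction of the sample, entropy the binary entropy of the match proportion. A minimal sketch (the function name `oracle_stats` is ours, not from the original script):

```python
import math

def oracle_stats(num_matches, num_non_matches):
    # Purity: fraction of the sample in the majority class.
    # Entropy: binary entropy of the match proportion, in bits.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = oracle_stats(14, 55)
print(round(purity, 3), round(entropy, 3))  # → 0.797 0.728
```

For the 14 matches and 55 non-matches classified above this yields 0.797 and 0.728, matching the log.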

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

106.0
Analysing file: diverg(20)829_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 829), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)829_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec
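
The load-and-analyse step counts how often each distinct weight vector occurs and how pure its true-match labels are; minority-class copies of any non-pure vector (such as the single 0.950-pureness vector above) are then removed. A dependency-free sketch of that bookkeeping (function and variable names are ours):

```python
from collections import Counter

def analyse_vectors(weight_vectors, true_labels):
    counts = Counter()        # occurrences of each unique weight vector
    match_counts = Counter()  # how many of those occurrences are true matches
    for vec, is_match in zip(weight_vectors, true_labels):
        key = tuple(vec)
        counts[key] += 1
        if is_match:
            match_counts[key] += 1
    # Pureness 1.0 or 0.0 means all pairs with this vector agree on the label.
    pureness = {k: match_counts[k] / c for k, c in counts.items()}
    non_pure = [k for k, p in pureness.items() if 0.0 < p < 1.0]
    return counts, pureness, non_pure

counts, pureness, non_pure = analyse_vectors(
    [(0.5, 0.9), (0.5, 0.9), (0.2, 0.1)], [True, False, False])
print(len(counts), non_pure)  # → 2 [(0.5, 0.9)]
```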

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
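
Farthest-first selection greedily adds, at each step, the vector whose distance to its nearest already-selected vector is largest, so the sample spreads over the whole cluster. A sketch assuming Euclidean distance and the first vector as the starting point (both are our assumptions; the original may differ):

```python
import math

def farthest_first(vectors, k, start=0):
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    # Distance from every vector to its nearest already-selected vector.
    min_d = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        far = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(far)
        for i, v in enumerate(vectors):
            d = dist(v, vectors[far])
            if d < min_d[i]:
                min_d[i] = d
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (1.0, 0.0)]
print(farthest_first(vecs, 3))  # → [0, 1, 3]
```

The point nearest an already-selected one ((0.1, 0.0) here) is skipped, which is exactly why the method tends to pick up both extreme matches and extreme non-matches.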

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
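
The SVM step trains on the oracle-labelled sample and splits the remaining unlabelled vectors into two child clusters: the predicted matches and predicted non-matches that enter the queue in the next loop. The sketch below mirrors only that split mechanism; as a dependency-free stand-in it classifies by nearest class centroid rather than an actual SVM:

```python
def split_cluster(train_vecs, train_labels, cluster_vecs):
    def centroid(vs):
        return [sum(col) / len(vs) for col in zip(*vs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Class centroids from the oracle-labelled training sample.
    c_match = centroid([v for v, l in zip(train_vecs, train_labels) if l])
    c_non = centroid([v for v, l in zip(train_vecs, train_labels) if not l])

    match_child, non_match_child = [], []
    for v in cluster_vecs:
        if sq_dist(v, c_match) < sq_dist(v, c_non):
            match_child.append(v)
        else:
            non_match_child.append(v)
    return match_child, non_match_child

m, n = split_cluster([[1.0, 1.0], [0.9, 1.0], [0.1, 0.0], [0.0, 0.1]],
                     [True, True, False, False],
                     [[0.95, 0.9], [0.05, 0.05]])
print(len(m), len(n))  # → 1 1
```

Note that in the log both child clusters initially inherit the purity, entropy, and estimated match proportion of the parent's oracle-labelled sample until they are themselves sampled.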

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)253_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (15, 1 - acm diverg, 253), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)253_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 320
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 320 weight vectors
  Containing 187 true matches and 133 true non-matches
    (58.44% true matches)
  Identified 298 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   282  (94.63%)
          2 :    13  (4.36%)
          3 :     2  (0.67%)
          6 :     1  (0.34%)

Identified 0 non-pure unique weight vectors (from 298 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 167
     0.000 : 131

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 320
  Number of unique weight vectors: 298

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (298, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 298 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 73

Perform initial selection using "far" method

Farthest first selection of 73 weight vectors from 298 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 33 matches and 40 non-matches
    Purity of oracle classification:  0.548
    Entropy of oracle classification: 0.993
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  40
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 225 weight vectors
  Based on 33 matches and 40 non-matches
  Classified 135 matches and 90 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 73
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.547945205479452, 0.9933570282728468, 0.4520547945205479)
    (90, 0.547945205479452, 0.9933570282728468, 0.4520547945205479)

Current size of match and non-match training data sets: 33 / 40

Selected cluster with (queue ordering: random):
- Purity 0.55 and entropy 0.99
- Size 90 weight vectors
- Estimated match proportion 0.452

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 90 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 5 matches and 42 non-matches
    Purity of oracle classification:  0.894
    Entropy of oracle classification: 0.489
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  42
    Number of false non-matches: 0

Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(20)839_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 839), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)839_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
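The purity and entropy reported after each oracle round follow the usual two-class definitions: purity is the majority-class fraction, and entropy is the Shannon entropy (base 2) of the match/non-match proportions. A minimal sketch (the function names are illustrative, not from the program):

```python
import math

def purity(num_matches, num_non_matches):
    # Purity: fraction of vectors belonging to the majority class.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Shannon entropy (base 2) of the match / non-match proportions.
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        p = count / total
        if p > 0.0:
            h -= p * math.log2(p)
    return h

# The 23 / 65 split reported above gives purity 0.739
# and entropy 0.829 (rounded to three decimals).
```

A perfectly pure cluster (all matches or all non-matches) has purity 1.0 and entropy 0.0; an even split has purity 0.5 and entropy 1.0, matching the initial queue entries printed below.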

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
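After the oracle-labelled sample is removed, the remaining vectors in the cluster are split by a classifier trained on that sample; the program uses an SVM. As a dependency-free sketch of the same train-then-split step, the following substitutes a simple perceptron for the SVM (a stand-in to show the shape of the step, not the program's classifier):

```python
def train_linear(samples, labels, epochs=50, lr=0.1):
    # Train a perceptron-style linear classifier (stand-in for the SVM).
    dim = len(samples[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            target = 1.0 if y else -1.0
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if target * score <= 0.0:  # misclassified: update weights
                w = [wi + lr * target * xi for wi, xi in zip(w, x)]
                b += lr * target
    return w, b

def split_cluster(vectors, w, b):
    # Partition the unlabelled vectors into predicted matches / non-matches,
    # producing the two new clusters that are pushed onto the queue.
    matches, non_matches = [], []
    for x in vectors:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        (matches if score > 0.0 else non_matches).append(x)
    return matches, non_matches
```

Training on the oracle's 23 matches and 65 non-matches and then predicting the 956 leftover vectors corresponds to the two-cluster split (109 / 847) shown in the next loop.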

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
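Farthest-first selection greedily picks each next vector to maximise its minimum distance to the vectors already selected, so the sample spreads across the cluster. A minimal sketch, assuming Euclidean distance and seeding from the first vector (the program's actual distance measure and seed choice may differ):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: start from the first vector, then
    # repeatedly add the vector whose minimum distance to the current
    # selection is largest, until k vectors are selected.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because each pick is the point farthest from everything chosen so far, the selected sample covers the extremes of the cluster rather than its dense centre.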

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)621_NEW.csv
<class 'pandas.core.series.Series'>
Current line right here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.977778
recall                 0.441472
f-measure              0.608295
da                          135
dm                            0
ndm                           0
tp                          132
fp                            3
tn                  4.76529e+07
fn                          167
Name: (10, 1 - acm diverg, 621), dtype: object
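The precision, recall, and f-measure rows in these summaries follow the standard definitions over the tp/fp/fn counts. A sketch reproducing the values above (the helper name is illustrative):

```python
def prf(tp, fp, fn):
    # Precision, recall and F1 from true-positive, false-positive and
    # false-negative counts, guarding against empty denominators.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

# tp=132, fp=3, fn=167 as in the summary above:
p, r, f = prf(132, 3, 167)  # ≈ (0.977778, 0.441472, 0.608295)
```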

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)621_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 665
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 665 weight vectors
  Containing 132 true matches and 533 true non-matches
    (19.85% true matches)
  Identified 633 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   604  (95.42%)
          2 :    26  (4.11%)
          3 :     3  (0.47%)
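The unique-vector and occurrence-frequency counts can be reproduced by tallying exact duplicates, e.g. with collections.Counter over tuple-valued vectors (a sketch; the program's internal representation may differ):

```python
from collections import Counter

def occurrence_distribution(vectors):
    # Count how often each exact weight vector occurs, then tally how
    # many unique vectors share each occurrence count.
    vec_counts = Counter(tuple(v) for v in vectors)
    return Counter(vec_counts.values())

# One vector occurring twice, one once, one three times:
vectors = [(0.1, 0.2)] * 2 + [(0.3, 0.4)] + [(0.5, 0.6)] * 3
```

For the run above, 604 vectors occur once, 26 twice, and 3 three times, which sums to the 633 unique vectors out of 665 in total.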

Identified 0 non-pure unique weight vectors (from 633 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 120
     0.000 : 513

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 665
  Number of unique weight vectors: 633

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (633, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 633 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 633 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 26 matches and 57 non-matches
    Purity of oracle classification:  0.687
    Entropy of oracle classification: 0.897
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 550 weight vectors
  Based on 26 matches and 57 non-matches
  Classified 112 matches and 438 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)
    (438, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)

Current size of match and non-match training data sets: 26 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.90
- Size 112 weight vectors
- Estimated match proportion 0.313

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.971, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 35 matches and 13 non-matches
    Purity of oracle classification:  0.729
    Entropy of oracle classification: 0.843
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  13
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

135.0
Analysing file: diverg(20)605_NEW.csv
<class 'pandas.core.series.Series'>
Current line right here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 605), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)605_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 961
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 961 weight vectors
  Containing 217 true matches and 744 true non-matches
    (22.58% true matches)
  Identified 906 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   870  (96.03%)
          2 :    33  (3.64%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 906 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority-class weight vectors with this pureness to be removed)
     0.000 : 723

Removed 1 non-pure weight vector
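A unique weight vector is non-pure when copies of it carry both match and non-match labels; the removal step drops the minority-class copies so that every remaining vector has a single true label. A sketch, assuming (vector, is_match) pairs and ties broken towards match (the program's actual tie-breaking is not shown in this output):

```python
from collections import Counter

def remove_minority_copies(labelled_vectors):
    # For each distinct weight vector, keep only the copies carrying
    # its majority label and drop the minority-class duplicates.
    label_counts = {}
    for vec, is_match in labelled_vectors:
        label_counts.setdefault(vec, Counter())[is_match] += 1
    kept = []
    for vec, is_match in labelled_vectors:
        counts = label_counts[vec]
        majority = counts[True] >= counts[False]
        if is_match == majority:
            kept.append((vec, is_match))
    return kept
```

In the run above, one vector occurred 19 times with 18 match labels and 1 non-match label (pureness 18/19 ≈ 0.947), so the single non-match copy was removed, leaving 960 of the original 961 vectors.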

Final number of weight vectors to use: 960
  Number of unique weight vectors: 906

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (906, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 906 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 906 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 819 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 151 matches and 668 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (668, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 151 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(10)366_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 366), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)366_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 202
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 202 weight vectors
  Containing 173 true matches and 29 true non-matches
    (85.64% true matches)
  Identified 183 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   170  (92.90%)
          2 :    10  (5.46%)
          3 :     2  (1.09%)
          6 :     1  (0.55%)
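
The occurrence distribution printed above can be reproduced with two nested `Counter`s: one mapping each unique weight vector to its occurrence count, and one mapping each occurrence count to how many unique vectors have it (percentages are relative to the number of unique vectors). The vectors below are hypothetical stand-ins for the loaded data:

```python
from collections import Counter

# Toy stand-in for the loaded weight vectors (tuples so they are hashable);
# the real program loads them from the weight vector CSV file named above
weight_vectors = [
    (0.85, 1.0, 0.73), (0.85, 1.0, 0.73),               # occurs twice
    (1.0, 0.78, 1.0),                                   # occurs once
    (0.5, 0.5, 0.5), (0.5, 0.5, 0.5), (0.5, 0.5, 0.5),  # occurs three times
]

vec_counts = Counter(weight_vectors)      # unique vector -> occurrence count
freq_dist = Counter(vec_counts.values())  # occurrence count -> number of vectors

for occ in sorted(freq_dist):
    num = freq_dist[occ]
    pct = 100.0 * num / len(vec_counts)
    print('%5d : %5d  (%.2f%%)' % (occ, num, pct))
```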

Identified 0 non-pure unique weight vectors (from 183 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 154
     0.000 : 29

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 202
  Number of unique weight vectors: 183

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (183, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 183 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 63

Perform initial selection using "far" method

Farthest first selection of 63 weight vectors from 183 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.344, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
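
A minimal sketch of the greedy farthest-first traversal the "far" method name suggests: start from one vector, then repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest. The seeding rule and distance function used by the actual program are assumptions here:

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors spread as far apart as possible
    (a sketch; the program's seeding and metric may differ)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]           # assumed seed: the first vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest selected neighbour
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

sample = farthest_first([(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)], 3)
```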

Perform oracle with 100.00% accuracy on 63 weight vectors
  The oracle will correctly classify 63 weight vectors and wrongly classify 0
  Classified 41 matches and 22 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.933
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  22
    Number of false non-matches: 0

Deleted 63 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 120 weight vectors
  Based on 41 matches and 22 non-matches
  Classified 120 matches and 0 non-matches
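
A sketch of the split step above, assuming a scikit-learn style SVM (the program's own classifier implementation is not shown in this log): train on the oracle-labelled weight vectors, then split the remaining cluster by predicted class. The training and remaining vectors below are hypothetical:

```python
from sklearn import svm

# Oracle-labelled training data: 1 = match, 0 = non-match (hypothetical values)
train_vectors = [[0.9, 0.95], [1.0, 0.9], [0.85, 1.0],   # matches
                 [0.1, 0.2], [0.0, 0.15], [0.2, 0.05]]   # non-matches
train_labels = [1, 1, 1, 0, 0, 0]

clf = svm.SVC(kernel='linear')
clf.fit(train_vectors, train_labels)

# Classify the remaining (unlabelled) vectors of the cluster, splitting it
# into a predicted-match cluster and a predicted-non-match cluster
remaining = [[0.95, 0.9], [0.05, 0.1]]
pred = clf.predict(remaining)
match_cluster = [v for v, p in zip(remaining, pred) if p == 1]
non_match_cluster = [v for v, p in zip(remaining, pred) if p == 0]
```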

69.0
Analysing the file: diverg(10)918_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 918), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)918_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 908
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 908 weight vectors
  Containing 200 true matches and 708 true non-matches
    (22.03% true matches)
  Identified 863 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   829  (96.06%)
          2 :    31  (3.59%)
          3 :     2  (0.23%)
         11 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 863 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 687

Removed 1 non-pure weight vector
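
Pureness here is the fraction of a unique weight vector's occurrences that were generated by true matching record pairs; a vector with 0 < pureness < 1 is non-pure, and its minority-class copies are removed. A small sketch:

```python
def pureness(match_count, non_match_count):
    """Fraction of occurrences of one unique weight vector
    that come from true matching record pairs."""
    return match_count / (match_count + non_match_count)

# Hypothetical unique vector seen 11 times: 10 times as a true match and
# once as a true non-match (cf. the pureness of 0.909 reported above)
p = pureness(10, 1)
print(round(p, 3))  # -> 0.909
```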

Final number of weight vectors to use: 907
  Number of unique weight vectors: 863

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (863, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 863 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 863 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 777 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 155 matches and 622 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (622, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 155 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 155 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 46 matches and 9 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.643
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(10)247_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990566
recall                 0.351171
f-measure              0.518519
da                          106
dm                            0
ndm                           0
tp                          105
fp                            1
tn                  4.76529e+07
fn                          194
Name: (10, 1 - acm diverg, 247), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)247_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 664
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 664 weight vectors
  Containing 154 true matches and 510 true non-matches
    (23.19% true matches)
  Identified 628 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   600  (95.54%)
          2 :    25  (3.98%)
          3 :     2  (0.32%)
          8 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 628 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 138
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 489

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 656
  Number of unique weight vectors: 627

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (627, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 627 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 627 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 26 matches and 57 non-matches
    Purity of oracle classification:  0.687
    Entropy of oracle classification: 0.897
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 544 weight vectors
  Based on 26 matches and 57 non-matches
  Classified 94 matches and 450 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)
    (450, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)

Current size of match and non-match training data sets: 26 / 57

Selected cluster (queue ordering: random):
- Purity 0.69 and entropy 0.90
- Size 94 weight vectors
- Estimated match proportion 0.313

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 94 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 42 matches and 2 non-matches
    Purity of oracle classification:  0.955
    Entropy of oracle classification: 0.267
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

106.0
Analysing file: diverg(10)736_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 736), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)736_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 451
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 451 weight vectors
  Containing 195 true matches and 256 true non-matches
    (43.24% true matches)
  Identified 427 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   410  (96.02%)
          2 :    14  (3.28%)
          3 :     2  (0.47%)
          7 :     1  (0.23%)
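
The uniqueness count and the occurrence distribution above can be reproduced by counting the weight vectors as tuples; a minimal sketch (function and variable names are illustrative, not the program's own):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # Count how often each distinct weight vector occurs, then tabulate
    # how many distinct vectors occur once, twice, and so on.
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    dist = Counter(vec_counts.values())
    return len(vec_counts), dict(sorted(dist.items()))

vectors = [[0.5, 1.0], [0.5, 1.0], [0.9, 0.8], [0.1, 0.2]]
print(occurrence_distribution(vectors))   # (3, {1: 2, 2: 1})
```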

Identified 0 non-pure unique weight vectors (from 427 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 254

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 451
  Number of unique weight vectors: 427

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (427, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 427 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78
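
All five sample sizes reported in this run (78, 66, 82, 56, 79) are consistent with a finite-population (Cochran-style) sample-size formula using z = 1.96, a 0.1 margin of error, and the cluster's estimated match proportion. This is a reconstruction reverse-engineered from the logged numbers, not necessarily the program's actual code:

```python
def sample_size(cluster_size, est_match_prop, z=1.96, err=0.1):
    # Finite-population sample size: n = N*z^2*p*q / (e^2*N + z^2*p*q).
    # The values of z, err, and the rounding are assumptions inferred
    # from the sample sizes printed in this log.
    z2pq = z * z * est_match_prop * (1.0 - est_match_prop)
    return round(cluster_size * z2pq / (err * err * cluster_size + z2pq))

print(sample_size(427, 0.5))       # 78 (this cluster)
print(sample_size(209, 36 / 78))   # 66 (the cluster selected in loop 2)
```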

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 427 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
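
Farthest-first selection greedily grows the sample by always adding the vector whose minimum Euclidean distance to the already-selected vectors is largest, which spreads the sample toward the corners of the weight-vector space. A minimal sketch; the choice of the first (seed) vector is an assumption:

```python
import math

def farthest_first(vectors, k, seed_index=0):
    # Greedy farthest-first traversal. Seeding with index 0 is an
    # assumption; the program may choose its starting vector differently.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [seed_index]
    while len(selected) < k:
        best_i, best_d = None, -1.0
        for i, v in enumerate(vectors):
            if i in selected:
                continue
            # Distance of candidate v to the nearest already-selected vector.
            d = min(dist(v, vectors[j]) for j in selected)
            if d > best_d:
                best_i, best_d = i, d
        selected.append(best_i)
    return [vectors[i] for i in selected]
```

Each step costs O(n·k) distance evaluations over the n cluster vectors.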

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 36 matches and 42 non-matches
    Purity of oracle classification:  0.538
    Entropy of oracle classification: 0.996
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  42
    Number of false non-matches: 0
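
The oracle's accuracy parameter can be simulated by flipping each true match status with probability 1 − accuracy. Whether the program flips a per-label coin or an exact count of labels is not visible in this log, so this per-label model is an assumption:

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    # Flip each true label with probability 1 - accuracy (assumed model).
    rng = rng or random.Random(42)   # fixed seed only for reproducibility
    return [lab if rng.random() < accuracy else (not lab)
            for lab in true_labels]
```

At 100.00% accuracy, as in this run, no labels are flipped and the oracle output equals the true match status.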

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 349 weight vectors
  Based on 36 matches and 42 non-matches
  Classified 140 matches and 209 non-matches
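
The SVM step trains on the oracle-labelled vectors and assigns each remaining (unlabelled) vector of the cluster to a match or non-match child cluster. A sketch using scikit-learn's `SVC` as a stand-in; the actual SVM implementation and parameters used by the program are not shown in this log:

```python
from sklearn import svm

def svm_split(train_match, train_non_match, remaining):
    # Train on the oracle-labelled vectors (1 = match, 0 = non-match),
    # then split the remaining vectors into two child clusters.
    X = train_match + train_non_match
    y = [1] * len(train_match) + [0] * len(train_non_match)
    clf = svm.SVC()          # default RBF kernel; an assumption
    clf.fit(X, y)
    pred = clf.predict(remaining)
    match_cluster = [v for v, p in zip(remaining, pred) if p == 1]
    non_match_cluster = [v for v, p in zip(remaining, pred) if p == 0]
    return match_cluster, non_match_cluster
```

In this loop the split of the 349 remaining vectors produced the child clusters of sizes 140 and 209 that appear in the loop-2 queue.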

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.5384615384615384, 0.9957274520849256, 0.46153846153846156)
    (209, 0.5384615384615384, 0.9957274520849256, 0.46153846153846156)

Current size of match and non-match training data sets: 36 / 42

Selected cluster with (queue ordering: random):
- Purity 0.54 and entropy 1.00
- Size 209 weight vectors
- Estimated match proportion 0.462

Sample size for this cluster: 66

Farthest first selection of 66 weight vectors from 209 vectors
  The selected farthest weight vectors are:
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 4 matches and 62 non-matches
    Purity of oracle classification:  0.939
    Entropy of oracle classification: 0.330
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(15)753_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 753), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)753_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 611
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 611 weight vectors
  Containing 191 true matches and 420 true non-matches
    (31.26% true matches)
  Identified 585 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   570  (97.44%)
          2 :    12  (2.05%)
          3 :     2  (0.34%)
         11 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 585 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.909 :  1   (minority class weight vector with this pureness to be removed)
     0.000 : 417

Removed 1 non-pure weight vector

Final number of weight vectors to use: 610
  Number of unique weight vectors: 585
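
A unique weight vector's pureness is the fraction of its occurrences generated by true matches (e.g. 10 matches out of 11 copies gives 0.909). Vectors that are not fully pure keep only their majority-class copies; the minority-class copies are removed. A minimal sketch, with the grouping key and the majority tie-break as assumptions:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels):
    # Group identical weight vectors; pureness = fraction of true-match
    # copies.  Non-pure groups (0 < pureness < 1) keep only their
    # majority-class copies.  The >= 0.5 tie-break is an assumption.
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)
    kept = []
    for vec, labs in groups.items():
        pureness = sum(labs) / len(labs)
        majority = pureness >= 0.5
        for lab in labs:
            if pureness in (0.0, 1.0) or lab == majority:
                kept.append((list(vec), lab))
    return kept
```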

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (585, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 585 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 585 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 32 matches and 50 non-matches
    Purity of oracle classification:  0.610
    Entropy of oracle classification: 0.965
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 503 weight vectors
  Based on 32 matches and 50 non-matches
  Classified 142 matches and 361 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)
    (361, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)

Current size of match and non-match training data sets: 32 / 50

Selected cluster with (queue ordering: random):
- Purity 0.61 and entropy 0.96
- Size 142 weight vectors
- Estimated match proportion 0.390

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 50 matches and 6 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.491
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(10)427_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 427), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)427_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 484
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 484 weight vectors
  Containing 181 true matches and 303 true non-matches
    (37.40% true matches)
  Identified 458 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   447  (97.60%)
          2 :     8  (1.75%)
          3 :     2  (0.44%)
         15 :     1  (0.22%)

Identified 1 non-pure unique weight vector (from 458 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 155
     0.933 :  1   (minority class weight vector with this pureness to be removed)
     0.000 : 302

Removed 1 non-pure weight vector

Final number of weight vectors to use: 483
  Number of unique weight vectors: 458

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (458, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 458 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 458 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.786, 0.833, 0.545, 0.478, 0.346] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 27 matches and 52 non-matches
    Purity of oracle classification:  0.658
    Entropy of oracle classification: 0.927
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

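The purity and entropy figures the oracle step prints can be reproduced from the match/non-match counts: purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the two class proportions. A minimal sketch (the function name `cluster_stats` is my own; the program may compute these differently internally):

```python
from math import log2

def cluster_stats(num_matches, num_non_matches):
    """Purity and entropy of a two-class cluster, as reported in the log.

    Purity is the fraction of the majority class; entropy is the Shannon
    entropy (base 2) of the match / non-match proportions.
    """
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                  # 0 * log2(0) is taken as 0
            entropy -= q * log2(q)
    return purity, entropy

purity, entropy = cluster_stats(27, 52)
print(round(purity, 3), round(entropy, 3))   # 0.658 0.927
```

With 27 matches and 52 non-matches this reproduces the purity 0.658 and entropy 0.927 printed above.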
Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 379 weight vectors
  Based on 27 matches and 52 non-matches
  Classified 136 matches and 243 non-matches

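The SVM step above trains on the oracle-labelled vectors and splits the remaining cluster into predicted matches and non-matches. As a dependency-free illustration of that data flow only, here is a nearest-centroid stand-in (not the program's SVM; `centroid_classify` is a name I made up for this sketch):

```python
def centroid_classify(train_matches, train_non_matches, unlabeled):
    """Split unlabeled weight vectors into predicted matches / non-matches.

    The program uses an SVM for this step; a nearest-centroid rule is a
    dependency-free stand-in that shows the same flow: oracle-labelled
    samples train a model, which then partitions the rest of the cluster.
    """
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cm, cn = centroid(train_matches), centroid(train_non_matches)
    matches, non_matches = [], []
    for v in unlabeled:
        (matches if sqdist(v, cm) < sqdist(v, cn) else non_matches).append(v)
    return matches, non_matches

m, n = centroid_classify([[1.0, 1.0]], [[0.0, 0.0]], [[0.9, 0.8], [0.1, 0.2]])
print(len(m), len(n))   # 1 1
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.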
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.6582278481012658, 0.9265044456232998, 0.34177215189873417)
    (243, 0.6582278481012658, 0.9265044456232998, 0.34177215189873417)

Current size of match and non-match training data sets: 27 / 52

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 243 weight vectors
- Estimated match proportion 0.342

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 243 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.750, 0.905, 0.667, 0.500, 0.571] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.680, 0.000, 0.609, 0.737, 0.600, 0.529, 0.696] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [1.000, 0.000, 0.333, 0.667, 0.400, 0.583, 0.563] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.545, 0.667, 0.571, 0.350, 0.600] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.667, 0.722, 0.353, 0.545, 0.800] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.433, 0.737, 0.706, 0.500, 0.800] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.500, 0.739, 0.824, 0.591, 0.550] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

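The "farthest first" selections logged throughout can be sketched as a greedy farthest-first traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest. The details here (Euclidean distance, fixed start index, tie-breaking by lowest index) are assumptions for illustration; the program's actual metric and tie-breaking may differ:

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal over a list of weight vectors.

    Starting from vectors[start], repeatedly select the vector whose
    minimum Euclidean distance to the selected set is largest, until
    k vectors have been chosen. Returns the selected indices.
    """
    selected = [start]
    # Minimum distance from every vector to the selected set so far.
    min_d = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_d[i] = min(min_d[i], math.dist(v, vectors[nxt]))
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.0, 1.0)]
print(farthest_first(pts, 3))   # [0, 3, 2]
```

On the toy points above the traversal picks the origin, then the opposite corner, then the point farthest from both, which is the spread-out behaviour visible in the selected weight vectors.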
Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 0 matches and 64 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(20)521_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 521), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)521_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1027
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1027 weight vectors
  Containing 223 true matches and 804 true non-matches
    (21.71% true matches)
  Identified 973 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   936  (96.20%)
          2 :    34  (3.49%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

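The occurrence table above (how many distinct weight vectors appear once, twice, and so on) amounts to two counting passes, which can be sketched with `collections.Counter` on made-up data:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Summarise how often each distinct weight vector occurs.

    First count occurrences of each distinct vector, then count how many
    distinct vectors share each occurrence count, mirroring the
    'Occurrence : Number of weight vectors' table printed in the log.
    """
    vec_counts = Counter(map(tuple, weight_vectors))   # vector -> count
    return Counter(vec_counts.values())                # count -> num vectors

wvs = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3], [0.9, 0.9], [1.0, 0.5]]
print(occurrence_distribution(wvs))   # Counter({1: 2, 3: 1})
```

Here two distinct vectors occur once and one occurs three times, analogous to the 936/34/2/1 breakdown above.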
Identified 1 non-pure unique weight vector (from 973 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 783

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 973

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (973, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 973 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 973 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 886 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 131 matches and 755 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (755, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 755 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 755 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 11 matches and 62 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)784_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 784), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)784_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1050
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1050 weight vectors
  Containing 208 true matches and 842 true non-matches
    (19.81% true matches)
  Identified 1003 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   968  (96.51%)
          2 :    32  (3.19%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1003 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 821

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1049
  Number of unique weight vectors: 1003

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1003, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1003 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1003 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 916 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (793, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 793 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
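
Farthest-first selection, as used above, greedily picks vectors that are maximally spread out. A minimal sketch, assuming Euclidean distance between weight vectors (the log does not state the distance function used):

```python
# Minimal farthest-first traversal (Euclidean distance assumed).
import numpy as np

def farthest_first(vectors, k):
    """Greedily select k vector indices: start from the first vector,
    then repeatedly pick the vector whose minimum distance to the
    already selected set is largest."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [0]
    # Distance of every vector to the closest selected vector so far
    min_dist = np.linalg.norm(vectors - vectors[0], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(
            min_dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected

pts = [[0, 0], [1, 0], [0, 1], [10, 10], [10, 11]]
print(farthest_first(pts, 3))  # → [0, 4, 1]
```

The second pick is the point farthest from the origin, (10, 11); the third maximizes the distance to the closest of the two already selected.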

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 12 matches and 58 non-matches
    Purity of oracle classification:  0.829
    Entropy of oracle classification: 0.661
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
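
The purity and entropy figures reported by the oracle can be reproduced from the class counts: purity is the fraction of the majority class, entropy the binary class entropy. A sketch (the function name is illustrative; for 12 matches and 58 non-matches it yields the 0.829 / 0.661 shown above):

```python
# Purity = majority-class fraction; entropy = binary class entropy.
import math

def purity_entropy(num_match, num_non_match):
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1 - p)
    entropy = 0.0
    for q in (p, 1 - p):
        if q > 0:
            entropy -= q * math.log2(q)
    return purity, entropy

pur, ent = purity_entropy(12, 58)
print(round(pur, 3), round(ent, 3))  # → 0.829 0.661
```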

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)250_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (15, 1 - acm diverg, 250), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)250_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 690
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 690 weight vectors
  Containing 178 true matches and 512 true non-matches
    (25.80% true matches)
  Identified 651 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   621  (95.39%)
          2 :    27  (4.15%)
          3 :     2  (0.31%)
          9 :     1  (0.15%)
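
The occurrence distribution above can be computed with two nested `Counter`s; the vectors below are illustrative stand-ins for the rows loaded from the CSV file.

```python
# Two nested Counters: unique-vector frequencies, then the
# distribution of those frequencies.
from collections import Counter

vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
           (0.9, 0.9), (0.2, 0.3), (0.2, 0.3)]
occ = Counter(vectors)        # each unique vector -> how often it occurs
dist = Counter(occ.values())  # occurrence count -> number of unique vectors
print(dist)  # one vector occurs once, one twice, one three times
```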

Identified 1 non-pure unique weight vector (from 651 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 159
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 491
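
The pureness analysis groups weight vectors by value and computes, for each unique vector, the fraction of its occurrences that are true matches; vectors that are neither all-match (1.000) nor all-non-match (0.000) are non-pure. A reconstruction from the log, with illustrative names and data:

```python
# Pureness per unique weight vector: fraction of occurrences that are
# true matches (1.0 = pure match, 0.0 = pure non-match).
from collections import defaultdict

def pureness(weight_vectors, labels):
    counts = defaultdict(lambda: [0, 0])  # vector -> [matches, total]
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)
        counts[key][0] += int(is_match)
        counts[key][1] += 1
    return {k: m / t for k, (m, t) in counts.items()}

vecs = [[1.0, 0.9], [1.0, 0.9], [0.1, 0.2]]
labels = [True, False, False]
print(pureness(vecs, labels))  # → {(1.0, 0.9): 0.5, (0.1, 0.2): 0.0}
```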

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 681
  Number of unique weight vectors: 650

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (650, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 650 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 650 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 26 matches and 57 non-matches
    Purity of oracle classification:  0.687
    Entropy of oracle classification: 0.897
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 567 weight vectors
  Based on 26 matches and 57 non-matches
  Classified 115 matches and 452 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (115, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)
    (452, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)

Current size of match and non-match training data sets: 26 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.90
- Size 115 weight vectors
- Estimated match proportion 0.313

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 115 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 46 matches and 2 non-matches
    Purity of oracle classification:  0.958
    Entropy of oracle classification: 0.250
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing file: diverg(10)302_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (10, 1 - acm diverg, 302), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)302_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 446
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 446 weight vectors
  Containing 200 true matches and 246 true non-matches
    (44.84% true matches)
  Identified 414 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   398  (96.14%)
          2 :    13  (3.14%)
          3 :     2  (0.48%)
         16 :     1  (0.24%)

Identified 1 non-pure unique weight vector (from 414 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 170
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 243

Removed 1 non-pure weight vector

Final number of weight vectors to use: 445
  Number of unique weight vectors: 414

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (414, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 414 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 414 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.833, 0.550, 0.500, 0.688] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 39 matches and 39 non-matches
    Purity of oracle classification:  0.500
    Entropy of oracle classification: 1.000
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  39
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 336 weight vectors
  Based on 39 matches and 39 non-matches
  Classified 273 matches and 63 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (273, 0.5, 1.0, 0.5)
    (63, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 39 / 39

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 273 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 273 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.913, 1.000, 0.184, 0.175, 0.087, 0.233, 0.167] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 41 matches and 30 non-matches
    Purity of oracle classification:  0.577
    Entropy of oracle classification: 0.983
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  30
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing file: diverg(15)560_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981481
recall                 0.177258
f-measure              0.300283
da                           54
dm                            0
ndm                           0
tp                           53
fp                            1
tn                  4.76529e+07
fn                          246
Name: (15, 1 - acm diverg, 560), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)560_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 829
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 829 weight vectors
  Containing 212 true matches and 617 true non-matches
    (25.57% true matches)
  Identified 775 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   740  (95.48%)
          2 :    32  (4.13%)
          3 :     2  (0.26%)
         19 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 775 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 596

Removed 1 non-pure weight vector

Final number of weight vectors to use: 828
  Number of unique weight vectors: 775

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (775, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 775 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 775 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
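
The "far" selection above is a farthest-first traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal sketch, where seeding with the first vector and using Euclidean distance are assumptions (the original may seed randomly):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors."""
    selected = [vectors[0]]            # seed choice is an assumption
    remaining = vectors[1:]
    while len(selected) < k and remaining:
        # Pick the remaining vector farthest from its nearest selected one.
        far = max(remaining,
                  key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(far)
        selected.append(far)
    return selected

# Tiny hypothetical example in 2-D: [1.0, 1.0] is farthest from the seed.
sample = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5], [0.9, 0.1]], 2)
```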

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
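
The purity and entropy figures reported for each classification are consistent with the majority-class fraction and the base-2 Shannon entropy of the match/non-match split. A sketch of those likely definitions, checked against the 28-match / 57-non-match round above:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = Shannon
    entropy (base 2) of the binary match/non-match distribution."""
    total = num_matches + num_non_matches
    probs = (num_matches / total, num_non_matches / total)
    purity = max(probs)
    entropy = -sum(p * math.log(p, 2) for p in probs if p > 0.0)
    return purity, entropy

# First oracle round: 28 matches and 57 non-matches.
purity, entropy = purity_and_entropy(28, 57)
```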

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 690 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 148 matches and 542 non-matches
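
The SVM step trains on the oracle-labelled vectors and splits the remaining cluster into predicted-match and predicted-non-match sub-clusters. As a dependency-free stand-in for the SVM (the real program presumably calls an SVM library), a minimal linear classifier trained by the perceptron rule illustrates the split; all data values here are hypothetical:

```python
def train_linear(X, y, epochs=100, lr=0.1):
    """Perceptron-style linear classifier -- a stand-in for the SVM."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            target = 1 if yi == 1 else -1
            act = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if target * act <= 0:      # misclassified: nudge the boundary
                w = [wj + lr * target * xj for wj, xj in zip(w, xi)]
                b += lr * target
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

# Oracle-labelled similarity vectors (1 = match, 0 = non-match).
X = [[1.0, 1.0, 0.9], [0.9, 1.0, 0.8], [1.0, 0.0, 0.2], [0.7, 0.0, 0.3]]
y = [1, 1, 0, 0]
w, b = train_linear(X, y)
# Partition the remaining unlabelled vectors of the cluster by class.
labels = [predict(w, b, v) for v in [[0.95, 1.0, 0.85], [0.8, 0.0, 0.25]]]
```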

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (542, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 542 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 542 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 4 matches and 69 non-matches
    Purity of oracle classification:  0.945
    Entropy of oracle classification: 0.306
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

54.0
Analysing file: diverg(20)848_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 848), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)848_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 667
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 667 weight vectors
  Containing 217 true matches and 450 true non-matches
    (32.53% true matches)
  Identified 630 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   612  (97.14%)
          2 :    15  (2.38%)
          3 :     2  (0.32%)
         19 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 630 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 447

Removed 1 non-pure weight vector

Final number of weight vectors to use: 666
  Number of unique weight vectors: 630

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (630, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 630 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 630 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 26 matches and 57 non-matches
    Purity of oracle classification:  0.687
    Entropy of oracle classification: 0.897
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 547 weight vectors
  Based on 26 matches and 57 non-matches
  Classified 133 matches and 414 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)
    (414, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)

Current size of match and non-match training data sets: 26 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.90
- Size 133 weight vectors
- Estimated match proportion 0.313

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 133 vectors
  The selected farthest weight vectors are:
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 49 matches and 2 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.239
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(15)508_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 508), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)508_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 913
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 913 weight vectors
  Containing 204 true matches and 709 true non-matches
    (22.34% true matches)
  Identified 862 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   828  (96.06%)
          2 :    31  (3.60%)
          3 :     2  (0.23%)
         17 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 862 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 688

Removed 1 non-pure weight vector

Final number of weight vectors to use: 912
  Number of unique weight vectors: 862

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (862, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 862 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 862 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
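The purity and entropy figures reported above can be reproduced directly from the match / non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch (not the original program's code, which is not shown in this log):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is the binary
    Shannon entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The 28 match / 58 non-match oracle sample above:
purity, entropy = purity_and_entropy(28, 58)
print(f"{purity:.3f} {entropy:.3f}")  # → 0.674 0.910
```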

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 776 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 146 matches and 630 non-matches
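The SVM step above trains on the oracle-labelled sample and splits the remaining weight vectors of the cluster into predicted matches and non-matches. A hedged sketch using scikit-learn's `SVC` (the actual SVM library and kernel settings used by the program are not visible in this log, so the linear kernel here is an assumption):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train an SVM on oracle-classified weight vectors (labels 1 =
    match, 0 = non-match) and split the remaining cluster into
    predicted matches / non-matches."""
    rest_vecs = np.asarray(rest_vecs, dtype=float)
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(rest_vecs)
    return rest_vecs[pred == 1], rest_vecs[pred == 0]
```

Each of the two resulting sub-clusters is then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.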

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (630, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 630 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 630 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
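The "farthest first" selection listed above repeatedly picks the weight vector with the greatest distance to its nearest already-selected vector, which spreads the sample over the cluster. A minimal sketch of this greedy traversal (Euclidean distance and the starting index are assumptions, as the log does not state the metric or seed):

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: start from one vector, then
    repeatedly add the vector whose distance to its closest already
    selected vector is largest. Returns the selected indices."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [start]
    # distance of every vector to its nearest selected vector so far
    dists = np.linalg.norm(vectors - vectors[start], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(dists))
        selected.append(nxt)
        dists = np.minimum(dists,
                           np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```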

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 3 matches and 71 non-matches
    Purity of oracle classification:  0.959
    Entropy of oracle classification: 0.245
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(15)875_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 875), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)875_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 645
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 645 weight vectors
  Containing 215 true matches and 430 true non-matches
    (33.33% true matches)
  Identified 593 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   557  (93.93%)
          2 :    33  (5.56%)
          3 :     2  (0.34%)
         16 :     1  (0.17%)
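The uniqueness analysis above can be reproduced by counting identical weight vectors and then tallying how many distinct vectors share each count. A minimal sketch, assuming each weight vector is hashable as a tuple:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then build
    the 'occurrence : number of vectors that occur that often'
    distribution shown in the log."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

dist = occurrence_distribution([(1.0, 0.5), (1.0, 0.5), (0.2, 0.3)])
print(dict(dist))  # → {2: 1, 1: 1}
```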

Identified 1 non-pure unique weight vectors (from 593 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 183
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 409

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 644
  Number of unique weight vectors: 593

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (593, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 593 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 593 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 28 matches and 54 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 511 weight vectors
  Based on 28 matches and 54 non-matches
  Classified 146 matches and 365 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6585365853658537, 0.9262122127346665, 0.34146341463414637)
    (365, 0.6585365853658537, 0.9262122127346665, 0.34146341463414637)

Current size of match and non-match training data sets: 28 / 54

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 365 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 365 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.714, 0.727, 0.750, 0.294, 0.833] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.348, 0.429, 0.526, 0.529, 0.619] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 10 matches and 60 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(15)297_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 297), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)297_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 465
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 465 weight vectors
  Containing 197 true matches and 268 true non-matches
    (42.37% true matches)
  Identified 441 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   424  (96.15%)
          2 :    14  (3.17%)
          3 :     2  (0.45%)
          7 :     1  (0.23%)

Identified 0 non-pure unique weight vectors (from 441 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 175
     0.000 : 266

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 465
  Number of unique weight vectors: 441

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (441, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 441 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 441 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 33 matches and 46 non-matches
    Purity of oracle classification:  0.582
    Entropy of oracle classification: 0.980
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  46
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 362 weight vectors
  Based on 33 matches and 46 non-matches
  Classified 135 matches and 227 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.5822784810126582, 0.980377508715691, 0.4177215189873418)
    (227, 0.5822784810126582, 0.980377508715691, 0.4177215189873418)

Current size of match and non-match training data sets: 33 / 46

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 227 weight vectors
- Estimated match proportion 0.418

Sample size for this cluster: 66

Farthest first selection of 66 weight vectors from 227 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 7 matches and 59 non-matches
    Purity of oracle classification:  0.894
    Entropy of oracle classification: 0.488
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
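The purity and entropy figures reported by the oracle step above can be reproduced with a short sketch. This assumes purity is the majority-class fraction of the cluster and entropy is the binary Shannon entropy of the match proportion (an assumption about the tool's internals, consistent with the numbers it prints):

```python
import math

def cluster_purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The 7 matches / 59 non-matches classified above:
purity, entropy = cluster_purity_entropy(7, 59)
print(round(purity, 3), round(entropy, 3))  # 0.894 0.488
```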

Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(10)188_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 188), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)188_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 586
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 586 weight vectors
  Containing 186 true matches and 400 true non-matches
    (31.74% true matches)
  Identified 546 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   512  (93.77%)
          2 :    31  (5.68%)
          3 :     2  (0.37%)
          6 :     1  (0.18%)

Identified 0 non-pure unique weight vectors (from 546 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 166
     0.000 : 380

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 586
  Number of unique weight vectors: 546

Time to load and analyse the weight vector file: 0.04 sec
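The frequency-distribution and pureness analysis above can be sketched as follows. This is a hypothetical reconstruction; `vectors` stands in for the loaded list of (weight_tuple, true_match_flag) pairs:

```python
from collections import Counter, defaultdict

# Hypothetical input: (weight_tuple, true_match_flag) pairs
vectors = [((1.0, 0.0), False), ((1.0, 0.0), False),
           ((0.8, 1.0), True), ((0.5, 0.5), True), ((0.5, 0.5), False)]

# Frequency distribution: how often each unique weight vector occurs
occ = Counter(w for w, _ in vectors)
freq_dist = Counter(occ.values())  # occurrence count -> number of unique vectors

# Pureness: fraction of matches among the pairs behind each unique vector
flags = defaultdict(list)
for w, is_match in vectors:
    flags[w].append(is_match)
pureness = {w: sum(f) / len(f) for w, f in flags.items()}

print(freq_dist)  # e.g. Counter({2: 2, 1: 1})
print(pureness)   # the (0.5, 0.5) vector is non-pure with pureness 0.5
```

A vector with pureness strictly between 0 and 1 is "non-pure": identical weight vectors were generated by both matching and non-matching record pairs.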

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (546, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 546 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 546 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
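Farthest-first selection as used above is typically the greedy traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and a fixed starting index (the original tool's distance metric and seeding are not shown in this log):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedily select k vectors, each maximising its minimum
    Euclidean distance to the vectors selected so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    while len(selected) < k:
        # Pick the vector farthest from its nearest selected neighbour
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

sample = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 0.9)]
print(farthest_first(sample, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.9, 0.9)]
```

This spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.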

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 32 matches and 49 non-matches
    Purity of oracle classification:  0.605
    Entropy of oracle classification: 0.968
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 465 weight vectors
  Based on 32 matches and 49 non-matches
  Classified 156 matches and 309 non-matches
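The SVM split step above trains on the oracle-labelled vectors and partitions the remaining cluster into a predicted-match and a predicted-non-match child. A sketch using scikit-learn's `SVC` (an assumed dependency; the kernel and toy data below are illustrative, not the original tool's settings):

```python
from sklearn.svm import SVC

# Hypothetical oracle-labelled training data (weight vectors + labels)
train_X = [[0.9, 0.8], [1.0, 0.9], [0.1, 0.2], [0.2, 0.1]]
train_y = [1, 1, 0, 0]  # 1 = match, 0 = non-match

clf = SVC(kernel='linear')
clf.fit(train_X, train_y)

# Split the unlabelled remainder of the cluster into two child clusters
remainder = [[0.95, 0.85], [0.15, 0.25], [0.05, 0.10]]
pred = clf.predict(remainder)
match_child = [v for v, p in zip(remainder, pred) if p == 1]
non_match_child = [v for v, p in zip(remainder, pred) if p == 0]
print(len(match_child), len(non_match_child))  # 1 2
```

Both child clusters are then pushed back onto the queue, which is why the queue length grows from 1 to 2 in the next loop.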

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6049382716049383, 0.9679884922470297, 0.3950617283950617)
    (309, 0.6049382716049383, 0.9679884922470297, 0.3950617283950617)

Current size of match and non-match training data sets: 32 / 49

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 309 weight vectors
- Estimated match proportion 0.395

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 309 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.556, 0.348, 0.467, 0.636, 0.412] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.538, 0.600, 0.471, 0.632, 0.688] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.800, 0.667, 0.381, 0.550, 0.429] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.800, 0.000, 0.444, 0.545, 0.333, 0.111, 0.533] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.462, 0.667, 0.636, 0.368, 0.500] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing the file: diverg(10)292_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (10, 1 - acm diverg, 292), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)292_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 700
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 700 weight vectors
  Containing 214 true matches and 486 true non-matches
    (30.57% true matches)
  Identified 665 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   650  (97.74%)
          2 :    12  (1.80%)
          3 :     2  (0.30%)
         20 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 665 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 179
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 485

Removed 1 non-pure weight vector

Final number of weight vectors to use: 699
  Number of unique weight vectors: 665

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (665, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 665 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 665 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 35 matches and 49 non-matches
    Purity of oracle classification:  0.583
    Entropy of oracle classification: 0.980
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 581 weight vectors
  Based on 35 matches and 49 non-matches
  Classified 252 matches and 329 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (252, 0.5833333333333334, 0.9798687566511527, 0.4166666666666667)
    (329, 0.5833333333333334, 0.9798687566511527, 0.4166666666666667)

Current size of match and non-match training data sets: 35 / 49

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 252 weight vectors
- Estimated match proportion 0.417

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 252 vectors
  The selected farthest weight vectors are:
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.800, 1.000, 0.211, 0.133, 0.074, 0.133, 0.185] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 46 matches and 22 non-matches
    Purity of oracle classification:  0.676
    Entropy of oracle classification: 0.908
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  22
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(15)746_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.99
recall                 0.331104
f-measure              0.496241
da                          100
dm                            0
ndm                           0
tp                           99
fp                            1
tn                  4.76529e+07
fn                          200
Name: (15, 1 - acm diverg, 746), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)746_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 800
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 800 weight vectors
  Containing 167 true matches and 633 true non-matches
    (20.88% true matches)
  Identified 761 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   732  (96.19%)
          2 :    26  (3.42%)
          3 :     2  (0.26%)
         10 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 761 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 148
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 612

Removed 1 non-pure weight vector

Final number of weight vectors to use: 799
  Number of unique weight vectors: 761

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (761, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 761 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 761 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
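The farthest-first traversal used for this selection can be sketched as a greedy max-min loop: repeatedly pick the vector whose distance to its nearest already-selected vector is largest. This is a generic sketch, not the authors' implementation; the Euclidean metric and the choice of the first vector as the starting point are assumptions.

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors; each step takes the vector whose
    distance to its nearest already-selected vector is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Assumption: start from the first vector in the list
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # For each candidate, consider its distance to the nearest selected vector
        far = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(far)
        remaining.remove(far)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
picked = farthest_first(pts, 3)  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

The greedy max-min rule spreads the sample across the cluster, which is why the selected vectors above mix clear matches and clear non-matches.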

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 26 matches and 59 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.888
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
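The purity and entropy figures follow from the 26/59 split. A minimal sketch, assuming purity is the majority-class fraction and entropy is the binary Shannon entropy — both assumptions reproduce the reported 0.694 and 0.888:

```python
import math

def purity(num_match, num_non_match):
    """Fraction of the sample belonging to the majority class."""
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    """Binary Shannon entropy of the match/non-match split, in bits."""
    total = num_match + num_non_match
    h = 0.0
    for n in (num_match, num_non_match):
        p = n / total
        if p > 0:
            h -= p * math.log2(p)
    return h

# 26 matches / 59 non-matches, as in the oracle output above
print(round(purity(26, 59), 3))   # 0.694
print(round(entropy(26, 59), 3))  # 0.888
```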

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 676 weight vectors
  Based on 26 matches and 59 non-matches
  Classified 89 matches and 587 non-matches
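The SVM split of the remaining cluster can be sketched with scikit-learn's svm.SVC — an assumption, since the script's SVM backend is not shown in this output. The oracle-labelled vectors act as training data, and the remaining cluster is partitioned by predicted class; all vectors below are hypothetical.

```python
from sklearn import svm

# Hypothetical oracle-classified weight vectors used as training data
train_X = [[0.9, 1.0, 0.8], [0.8, 0.9, 1.0],   # matches
           [0.1, 0.0, 0.2], [0.2, 0.1, 0.0]]   # non-matches
train_y = [1, 1, 0, 0]

clf = svm.SVC(kernel="linear")
clf.fit(train_X, train_y)

# Split the remaining (unlabelled) cluster by predicted class
remaining = [[0.95, 0.9, 0.9], [0.05, 0.1, 0.1], [0.85, 1.0, 0.7]]
pred = clf.predict(remaining)
match_cluster = [v for v, p in zip(remaining, pred) if p == 1]
non_match_cluster = [v for v, p in zip(remaining, pred) if p == 0]
```

Splitting by predicted class yields the two child clusters that are pushed back onto the queue, as seen in the next loop iteration.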

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (89, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)
    (587, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)

Current size of match and non-match training data sets: 26 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 587 weight vectors
- Estimated match proportion 0.306

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 587 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 16 matches and 55 non-matches
    Purity of oracle classification:  0.775
    Entropy of oracle classification: 0.770
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(15)910_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 910), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)910_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 860
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 860 weight vectors
  Containing 227 true matches and 633 true non-matches
    (26.40% true matches)
  Identified 803 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   766  (95.39%)
          2 :    34  (4.23%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 803 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 612

Removed 1 non-pure weight vector

Final number of weight vectors to use: 859
  Number of unique weight vectors: 803

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (803, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 803 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 803 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 718 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 155 matches and 563 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (563, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 563 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 563 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 5 matches and 69 non-matches
    Purity of oracle classification:  0.932
    Entropy of oracle classification: 0.357
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)657_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 657), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)657_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 882
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 882 weight vectors
  Containing 212 true matches and 670 true non-matches
    (24.04% true matches)
  Identified 830 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   794  (95.66%)
          2 :    33  (3.98%)
          3 :     2  (0.24%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 830 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 649

Removed 1 non-pure weight vector

Final number of weight vectors to use: 881
  Number of unique weight vectors: 830

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (830, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 830 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 830 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 744 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 163 matches and 581 non-matches
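
The SVM step above trains on the oracle-labelled vectors (29 matches, 57 non-matches) and splits the remaining 744 vectors into predicted matches and non-matches. A dependency-free sketch of that split, substituting a nearest-centroid rule for the SVM (`split_cluster` is a hypothetical helper, not the script's actual function):

```python
import math

def split_cluster(unlabelled, matches, non_matches):
    """Split unlabelled weight vectors into predicted matches and
    non-matches with a nearest-centroid rule (stand-in for the SVM)."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    m_cent, n_cent = centroid(matches), centroid(non_matches)
    pred_m, pred_n = [], []
    for v in unlabelled:
        # Assign each vector to the class whose centroid is closer
        (pred_m if math.dist(v, m_cent) <= math.dist(v, n_cent)
         else pred_n).append(v)
    return pred_m, pred_n
```

A real run would use a proper SVM (e.g. scikit-learn's `svm.SVC`) trained on the same labelled vectors; the centroid rule here only illustrates the split.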

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (163, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (581, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
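
The purity and entropy reported for each cluster appear to be the majority-class fraction and the binary entropy of the estimated match proportion; a minimal sketch under that assumption reproduces the queue values above:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class fraction (purity) and binary entropy of the
    match / non-match proportion in a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# 29 matches and 57 non-matches, as classified by the oracle above:
purity, entropy = purity_entropy(29, 57)
print(round(purity, 3), round(entropy, 3))  # 0.663 0.922
```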

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 163 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 163 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
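
The farthest-first selections above can be sketched as a greedy traversal: start from one vector, then repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest (a minimal sketch; the script's actual seeding and distance metric may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: select k of the given vectors,
    maximising coverage of the weight-vector space."""
    selected = [vectors[0]]                      # arbitrary seed
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected
```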

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 47 matches and 9 non-matches
    Purity of oracle classification:  0.839
    Entropy of oracle classification: 0.636
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(20)564_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 564), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)564_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1035
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1035 weight vectors
  Containing 223 true matches and 812 true non-matches
    (21.55% true matches)
  Identified 981 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   944  (96.23%)
          2 :    34  (3.47%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 981 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 791

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1034
  Number of unique weight vectors: 981

Time to load and analyse the weight vector file: 0.01 sec
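
The load-and-analyse summary above (unique vectors, occurrence frequencies, pureness) can be reproduced with a simple tally. A sketch assuming each weight vector comes paired with its true match status (`analyse_weight_vectors` is a hypothetical helper, not the script's actual function):

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(vectors, true_status):
    """Tally unique weight vectors, how often each occurs, and the
    pureness (fraction of true matches) of each unique vector."""
    occ = Counter(tuple(v) for v in vectors)
    match_count = defaultdict(int)
    for v, is_match in zip(vectors, true_status):
        match_count[tuple(v)] += int(is_match)
    freq_dist = Counter(occ.values())       # occurrence -> number of vectors
    pureness = {v: match_count[v] / c for v, c in occ.items()}
    return len(occ), freq_dist, pureness
```

A unique vector with pureness strictly between 0 and 1 is non-pure; the script removes its minority-class copies before training.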

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (981, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 981 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 981 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 894 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 156 matches and 738 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (738, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 156 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 156 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(15)771_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 771), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)771_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 530
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 530 weight vectors
  Containing 208 true matches and 322 true non-matches
    (39.25% true matches)
  Identified 501 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   484  (96.61%)
          2 :    14  (2.79%)
          3 :     2  (0.40%)
         12 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 501 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 319

Removed 1 non-pure weight vector

Final number of weight vectors to use: 529
  Number of unique weight vectors: 501

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (501, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 501 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 501 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 32 matches and 48 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 421 weight vectors
  Based on 32 matches and 48 non-matches
  Classified 140 matches and 281 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.6, 0.9709505944546686, 0.4)
    (281, 0.6, 0.9709505944546686, 0.4)

Current size of match and non-match training data sets: 32 / 48

Selected cluster (queue ordering: random) with:
- Purity 0.60 and entropy 0.97
- Size 140 weight vectors
- Estimated match proportion 0.400

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 140 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 53 matches and 3 non-matches
    Purity of oracle classification:  0.946
    Entropy of oracle classification: 0.301
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)138_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (10, 1 - acm diverg, 138), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)138_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 432
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 432 weight vectors
  Containing 184 true matches and 248 true non-matches
    (42.59% true matches)
  Identified 411 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   401  (97.57%)
          2 :     7  (1.70%)
          3 :     2  (0.49%)
         11 :     1  (0.24%)

Identified 1 non-pure unique weight vector (from 411 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 163
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 247

Removed 1 non-pure weight vector

Final number of weight vectors to use: 431
  Number of unique weight vectors: 411
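
The de-duplication and pureness filtering above can be sketched as follows (a minimal sketch; function and variable names are illustrative, not from the original program). Identical weight vectors are grouped, the pureness of each unique vector is the fraction of its copies generated by true matches, and for any non-pure vector the minority-class copies are removed:

```python
from collections import defaultdict

def filter_non_pure(weight_vectors, match_flags):
    """Group identical weight vectors, compute pureness (fraction of true
    matches per unique vector), and drop the minority-class copies of any
    vector that is neither fully pure (1.0) nor fully impure (0.0)."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, match_flags):
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, flags in groups.items():
        pureness = sum(flags) / len(flags)
        if 0.0 < pureness < 1.0:
            # non-pure vector: keep only the majority class
            majority = pureness >= 0.5
            flags = [f for f in flags if f == majority]
        kept.extend((list(vec), f) for f in flags)
    return kept
```

For example, a unique vector occurring 11 times with 10 true-match copies and 1 non-match copy (pureness 0.909, as in the table above) loses its single minority-class copy.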

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (411, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 411 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 411 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
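
"Farthest first" selection, as used above, is the classic greedy traversal: start from one vector and repeatedly add the vector whose minimum distance to the already-selected set is largest. A sketch assuming Euclidean distance and a fixed starting index (the original program's distance metric and seeding are not shown in this log):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly select the vector with
    the largest minimum Euclidean distance to those chosen so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [start]
    # minimum distance from each vector to the selected set so far
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[nxt]))
    return [vectors[i] for i in selected]
```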

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 33 matches and 45 non-matches
    Purity of oracle classification:  0.577
    Entropy of oracle classification: 0.983
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 333 weight vectors
  Based on 33 matches and 45 non-matches
  Classified 124 matches and 209 non-matches
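
When a sampled cluster is still impure and large enough to split, the oracle-labelled sample is used to train an SVM that partitions the remaining weight vectors into a candidate-match and a candidate-non-match cluster, as in the "SVM classification" step above. A sketch assuming scikit-learn is available (the original program's SVM parameters are not shown in this log, so the default classifier is used):

```python
from sklearn.svm import SVC  # assumes scikit-learn is installed

def svm_split(sample_vecs, sample_labels, remaining_vecs):
    """Train an SVM on the oracle-labelled sample (1 = match, 0 = non-match)
    and partition the remaining weight vectors of the cluster accordingly."""
    clf = SVC()  # default RBF kernel; the original settings are unknown
    clf.fit(sample_vecs, sample_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why Loop 2 above shows a queue length of 2.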

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (124, 0.5769230769230769, 0.9828586897127056, 0.4230769230769231)
    (209, 0.5769230769230769, 0.9828586897127056, 0.4230769230769231)

Current size of match and non-match training data sets: 33 / 45

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 124 weight vectors
- Estimated match proportion 0.423

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 124 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 47 matches and 7 non-matches
    Purity of oracle classification:  0.870
    Entropy of oracle classification: 0.556
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analyzing file: diverg(15)217_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (15, 1 - acm diverg, 217), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)217_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 863
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 863 weight vectors
  Containing 160 true matches and 703 true non-matches
    (18.54% true matches)
  Identified 829 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   799  (96.38%)
          2 :    27  (3.26%)
          3 :     2  (0.24%)
          4 :     1  (0.12%)

Identified 0 non-pure unique weight vectors (from 829 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 146
     0.000 : 683

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 863
  Number of unique weight vectors: 829

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (829, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 829 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 829 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 26 matches and 60 non-matches
    Purity of oracle classification:  0.698
    Entropy of oracle classification: 0.884
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 743 weight vectors
  Based on 26 matches and 60 non-matches
  Classified 94 matches and 649 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.6976744186046512, 0.8841151220488478, 0.3023255813953488)
    (649, 0.6976744186046512, 0.8841151220488478, 0.3023255813953488)

Current size of match and non-match training data sets: 26 / 60

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 94 weight vectors
- Estimated match proportion 0.302

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 94 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analyzing file: diverg(15)922_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 922), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)922_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

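The purity and entropy figures reported after each oracle call can be reproduced from the match / non-match counts alone: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the split. A minimal sketch (the function name is illustrative, not taken from the program's source):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity is the majority-class fraction; entropy is the binary
    Shannon entropy of the match / non-match split."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# 28 matches and 57 non-matches, as in the oracle step above:
purity, entropy = purity_entropy(28, 57)
print(round(purity, 3), round(entropy, 3))  # 0.671 0.914
```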
Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 141 matches and 543 non-matches

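The step above trains an SVM on the 85 oracle-labelled vectors and uses it to split the remaining 684 vectors into two new clusters. As a dependency-free sketch of the idea, a nearest-centroid classifier stands in for the SVM below (the toy data and function names are hypothetical; the actual program trains an SVM):

```python
import math

def centroid(vecs):
    # Component-wise mean of a list of equal-length vectors
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def classify(train_matches, train_non_matches, unlabelled):
    """Stand-in for the SVM step: assign each unlabelled weight vector
    to the class whose training centroid is nearest."""
    cm, cn = centroid(train_matches), centroid(train_non_matches)
    return [math.dist(v, cm) < math.dist(v, cn) for v in unlabelled]

# Toy data: high similarities tend to be matches, low ones non-matches.
matches = [[0.9, 0.8, 1.0], [1.0, 0.9, 0.9]]
non_matches = [[0.1, 0.2, 0.0], [0.2, 0.1, 0.3]]
pred = classify(matches, non_matches, [[0.95, 0.85, 0.9], [0.15, 0.1, 0.2]])
print(pred)  # [True, False]
```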
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 141 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 53

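Most sample sizes in this log (e.g. 87 from 989 vectors, 53 from 141, 69 from 706) are consistent with Cochran's sample-size formula with finite population correction, assuming z = 1.96 (95% confidence) and a 10% sampling error; this is an inference from the printed numbers, not confirmed against the program's source:

```python
def cochran_sample_size(pop_size, match_prop, z=1.96, err=0.10):
    """Cochran's sample-size formula with finite population correction.
    The parameters z = 1.96 and err = 0.10 are assumed, inferred from
    the sample sizes printed in the log."""
    n0 = z * z * match_prop * (1.0 - match_prop) / (err * err)
    return int(n0 / (1.0 + (n0 - 1.0) / pop_size))

print(cochran_sample_size(141, 0.32941176470588235))  # 53
print(cochran_sample_size(989, 0.5))                  # 87
print(cochran_sample_size(706, 0.27586206896551724))  # 69
```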
Farthest first selection of 53 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

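The oracle step simulates a human annotator with a configurable accuracy (the `oracle_acc` command-line parameter described in the usage notes); at 100% accuracy no labels are flipped, which is why the log shows zero false matches and false non-matches. A hypothetical sketch of such an oracle (the real implementation may differ):

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    """Return each true label unchanged with probability `accuracy`,
    flipped otherwise. Hypothetical sketch of the oracle step."""
    rng = rng or random.Random(1)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]

labels = [True] * 50 + [False] * 3
print(noisy_oracle(labels, 1.0) == labels)  # True: a perfect oracle never flips
```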
Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)500_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 500), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)500_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1043
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1043 weight vectors
  Containing 222 true matches and 821 true non-matches
    (21.28% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   952  (96.26%)
          2 :    34  (3.44%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 800

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1042
  Number of unique weight vectors: 989

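The analysis above — occurrence counts of unique weight vectors, and their pureness (the fraction of record pairs sharing a weight vector that are true matches) — can be reproduced with two counters. A toy sketch (the data here is made up for illustration):

```python
from collections import Counter, defaultdict

# Toy data: (rounded) weight vectors paired with true match labels.
vectors = [((1.0, 0.9), True), ((1.0, 0.9), True), ((1.0, 0.9), False),
           ((0.2, 0.1), False), ((0.5, 0.5), True)]

freq = Counter(v for v, _ in vectors)
occ_dist = Counter(freq.values())  # occurrence : number of unique vectors

match_count = defaultdict(int)
for v, is_match in vectors:
    match_count[v] += int(is_match)

# Pureness = fraction of matches among pairs sharing a weight vector;
# a vector is non-pure if its pureness is strictly between 0 and 1.
pureness = {v: match_count[v] / freq[v] for v in freq}
non_pure = [v for v, p in pureness.items() if 0.0 < p < 1.0]
print(occ_dist, non_pure)
```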
Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 145 matches and 757 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (757, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 145 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)41_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 41), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)41_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(20)527_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 527), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)527_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1038 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038
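
The analysis summarised above (unique weight vectors, occurrence frequencies, per-vector pureness) can be sketched with `collections.Counter`; the tiny labelled vectors below are illustrative, not taken from the CSV:

```python
from collections import Counter

# Illustrative weight vectors with true-match labels (not from the file)
vectors = [
    ((1.0, 0.5), True), ((1.0, 0.5), True),    # occurs twice, pure match
    ((0.2, 0.1), False),                        # occurs once, pure non-match
    ((0.8, 0.9), True), ((0.8, 0.9), False),   # occurs twice, non-pure
]

# Occurrence frequency of each unique weight vector
freq = Counter(v for v, _ in vectors)

# Pureness of each unique vector: fraction of its occurrences that are matches
match_counts = Counter(v for v, is_match in vectors if is_match)
pureness = {v: match_counts[v] / n for v, n in freq.items()}

# Non-pure unique vectors have a pureness strictly between 0 and 1;
# their minority-class occurrences are the ones removed in the log above
non_pure = [v for v, p in pureness.items() if 0.0 < p < 1.0]
```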

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
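
The farthest-first selections in this log can be sketched as a standard farthest-first traversal: seed with one vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. The seed choice and distance metric below (first vector, Euclidean) are assumptions; the program may differ:

```python
import math

def farthest_first(vectors, k):
    # Seed with the first vector (an assumption; any seed works).
    selected = [vectors[0]]
    while len(selected) < k:
        # Pick the vector farthest from its nearest selected neighbour.
        candidate = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(math.dist(v, s) for s in selected),
        )
        selected.append(candidate)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.5, 0.5)]
farthest_first(pts, 2)  # → [(0.0, 0.0), (1.0, 1.0)]
```

Each round of this traversal spreads the sample across the cluster, which is why the selected vectors above mix clear matches and clear non-matches.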

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
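
The purity and entropy figures reported for each oracle classification follow directly from the match/non-match counts: purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion. A sketch that reproduces the numbers above (23 matches, 65 non-matches → purity 0.739, entropy 0.829):

```python
import math

def purity_entropy(num_match, num_non_match):
    total = num_match + num_non_match
    p = num_match / total          # match proportion
    purity = max(p, 1.0 - p)       # majority-class fraction
    # Binary Shannon entropy; the 0*log(0) term is taken as 0
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity_entropy(23, 65)  # → (0.7386..., 0.8287...)
```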

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches
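
The split step trains a classifier on the oracle-labelled sample and uses it to partition the remaining vectors into a candidate match cluster and a candidate non-match cluster, both of which re-enter the queue. A minimal sketch assuming scikit-learn's `svm.SVC`; the kernel and parameters the program actually uses are not shown in this log:

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train an SVM on the oracle-classified sample, then split the
    remaining cluster by predicted class (1 = match, 0 = non-match)."""
    clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, pred) if p == 0]
    return matches, non_matches
```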

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)96_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (10, 1 - acm diverg, 96), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)96_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 190
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 190 weight vectors
  Containing 143 true matches and 47 true non-matches
    (75.26% true matches)
  Identified 178 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   170  (95.51%)
          2 :     5  (2.81%)
          3 :     2  (1.12%)
          4 :     1  (0.56%)

Identified 0 non-pure unique weight vectors (from 178 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 131
     0.000 : 47

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 190
  Number of unique weight vectors: 178

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (178, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 178 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 62

Perform initial selection using "far" method

Farthest first selection of 62 weight vectors from 178 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 62 weight vectors
  The oracle will correctly classify 62 weight vectors and wrongly classify 0
  Classified 32 matches and 30 non-matches
    Purity of oracle classification:  0.516
    Entropy of oracle classification: 0.999
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  30
    Number of false non-matches: 0

Deleted 62 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 116 weight vectors
  Based on 32 matches and 30 non-matches
  Classified 108 matches and 8 non-matches

  Non-match cluster not large enough for required sample size
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 1
  Number of manual oracle classifications performed: 62
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (108, 0.5161290322580645, 0.9992492479956565, 0.5161290322580645)

Current size of match and non-match training data sets: 32 / 30

Selected cluster (queue ordering: random) with:
- Purity 0.52 and entropy 1.00
- Size 108 weight vectors
- Estimated match proportion 0.516

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 108 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 42 matches and 9 non-matches
    Purity of oracle classification:  0.824
    Entropy of oracle classification: 0.672
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing the file: diverg(10)265_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 265), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)265_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 691
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 691 weight vectors
  Containing 191 true matches and 500 true non-matches
    (27.64% true matches)
  Identified 667 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   650  (97.45%)
          2 :    14  (2.10%)
          3 :     2  (0.30%)
          7 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 667 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 169
     0.000 : 498

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 691
  Number of unique weight vectors: 667

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (667, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 667 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 667 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 34 matches and 50 non-matches
    Purity of oracle classification:  0.595
    Entropy of oracle classification: 0.974
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0
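
The purity and entropy figures above follow directly from the two class counts. A minimal sketch of that computation (a hypothetical `oracle_stats` helper, not taken from the original program):

```python
import math

def oracle_stats(num_matches, num_non_matches):
    """Purity, binary entropy, and match proportion of an oracle
    classification, reconstructed from the class counts in the log.
    (Hypothetical helper, not from the original program.)"""
    total = num_matches + num_non_matches
    p = num_matches / total          # estimated match proportion
    purity = max(p, 1.0 - p)         # fraction in the majority class
    entropy = 0.0                    # Shannon entropy in bits; 0 if pure
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy, p

purity, entropy, p = oracle_stats(34, 50)
print(round(purity, 3), round(entropy, 3), round(p, 3))  # 0.595 0.974 0.405
```

For 34 matches and 50 non-matches this reproduces the logged purity 0.595, entropy 0.974, and the estimated match proportion 0.405 carried into the next loop's queue.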

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 583 weight vectors
  Based on 34 matches and 50 non-matches
  Classified 274 matches and 309 non-matches
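
The split step uses the oracle-labelled sample as training data for an SVM that partitions the remaining cluster. A rough sketch, assuming scikit-learn's `SVC` with a linear kernel and random placeholder data (the real program trains on the actual 7-dimensional weight vectors, and its kernel and parameters may differ):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
train_X = rng.random((84, 7))            # 84 oracle-labelled weight vectors
train_y = np.array([1] * 34 + [0] * 50)  # 34 matches, 50 non-matches
rest_X = rng.random((583, 7))            # remaining unlabelled vectors

clf = SVC(kernel="linear")               # kernel choice is an assumption
clf.fit(train_X, train_y)
pred = clf.predict(rest_X)

# The cluster is split into a predicted-match part and a
# predicted-non-match part, which both go back onto the queue.
matches = rest_X[pred == 1]
non_matches = rest_X[pred == 0]
```

Every vector lands in exactly one of the two child clusters, so their sizes always sum to the parent's size (274 + 309 = 583 above).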

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (274, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)
    (309, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)

Current size of match and non-match training data sets: 34 / 50

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 274 weight vectors
- Estimated match proportion 0.405

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 274 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
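
The "farthest first" selections above can be sketched as a farthest-first traversal: starting from one vector, repeatedly add the vector whose distance to its nearest already-selected vector is largest. A minimal version, assuming Euclidean distance (the program's actual distance measure and starting vector may differ):

```python
import math

def farthest_first(vectors, k, start=0):
    """Select k vectors by farthest-first traversal (toy sketch)."""
    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        # Pick the candidate whose nearest selected vector is farthest away.
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

This greedy rule spreads the sample across the cluster, which is why the selected vectors above mix clear matches and clear non-matches rather than clustering around one region.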

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 43 matches and 26 non-matches
    Purity of oracle classification:  0.623
    Entropy of oracle classification: 0.956
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analysing file: diverg(10)208_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987952
recall                 0.274247
f-measure              0.429319
da                           83
dm                            0
ndm                           0
tp                           82
fp                            1
tn                  4.76529e+07
fn                          217
Name: (10, 1 - acm diverg, 208), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)208_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 576
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 576 weight vectors
  Containing 161 true matches and 415 true non-matches
    (27.95% true matches)
  Identified 556 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   545  (98.02%)
          2 :     8  (1.44%)
          3 :     2  (0.36%)
          9 :     1  (0.18%)
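
A frequency table like the one above can be built with two `Counter`s: one mapping each unique weight vector to its occurrence count, and one mapping each occurrence count to the number of vectors sharing it. A toy sketch (not the original code):

```python
from collections import Counter

# Toy weight vectors; tuples so they are hashable.
vectors = [(1.0, 0.5), (1.0, 0.5),
           (0.2, 0.3), (0.2, 0.3), (0.2, 0.3),
           (0.9, 0.9)]

occ = Counter(vectors)        # vector -> how often it occurs
freq = Counter(occ.values())  # occurrence count -> number of such vectors

for count in sorted(freq):
    n = freq[count]
    print(f"{count} : {n}  ({100.0 * n / len(occ):.2f}%)")
```

The percentages are taken over the unique vectors, matching the log (e.g. 545 of 556 unique vectors occur exactly once).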

Identified 1 non-pure unique weight vector (from 556 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 143
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 412

Removed 9 non-pure weight vectors
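
The pureness filter drops unique weight vectors that appear with both true match statuses. A toy sketch mirroring the counts above (one vector occurring 8 times as a match and once as a non-match, pureness 8/9 ≈ 0.889, so all 9 occurrences are removed); the real program computes pureness as the match percentage and may handle large mixed groups differently:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """Keep only occurrences of vectors seen with a single label (sketch)."""
    labels = defaultdict(set)            # vector -> set of labels seen
    for vec, is_match in weight_vectors:
        labels[vec].add(is_match)
    return [(v, m) for v, m in weight_vectors if len(labels[v]) == 1]

data = ([((1.0, 1.0), True)] * 8 + [((1.0, 1.0), False)]  # non-pure vector
        + [((0.2, 0.1), False)] * 3)                      # pure vector
print(len(data) - len(remove_non_pure(data)))  # 9 occurrences removed
```

After filtering, every remaining unique vector has pureness exactly 0.0 or 1.0, which is what lets the oracle's answers be treated as ground truth.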

Final number of weight vectors to use: 567
  Number of unique weight vectors: 555

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (555, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 555 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 555 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 29 matches and 53 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.937
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 473 weight vectors
  Based on 29 matches and 53 non-matches
  Classified 119 matches and 354 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (119, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)
    (354, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)

Current size of match and non-match training data sets: 29 / 53

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 119 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 119 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 45 matches and 6 non-matches
    Purity of oracle classification:  0.882
    Entropy of oracle classification: 0.523
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

83.0
Analysing file: diverg(10)39_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 39), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)39_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 579
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 579 weight vectors
  Containing 151 true matches and 428 true non-matches
    (26.08% true matches)
  Identified 562 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   553  (98.40%)
          2 :     6  (1.07%)
          3 :     2  (0.36%)
          8 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 562 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 136
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 425

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 571
  Number of unique weight vectors: 561

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (561, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 561 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 561 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 32 matches and 50 non-matches
    Purity of oracle classification:  0.610
    Entropy of oracle classification: 0.965
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 479 weight vectors
  Based on 32 matches and 50 non-matches
  Classified 106 matches and 373 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (106, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)
    (373, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)

Current size of match and non-match training data sets: 32 / 50

Selected cluster with (queue ordering: random):
- Purity 0.61 and entropy 0.96
- Size 106 weight vectors
- Estimated match proportion 0.390

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 106 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 42 matches and 7 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0
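The purity and entropy figures in the oracle summaries above follow the standard two-class definitions: purity is the majority-class fraction of the sample, and entropy is the Shannon entropy (base 2) of the match/non-match split. A minimal sketch (the function names are mine, not taken from the original script):

```python
import math

def purity(num_matches, num_non_matches):
    """Fraction of the sample belonging to the majority class."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    """Shannon entropy (base 2) of the match/non-match split."""
    total = num_matches + num_non_matches
    result = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            result -= p * math.log2(p)
    return result

# The oracle sample above had 42 matches and 7 non-matches:
print(round(purity(42, 7), 3))   # 0.857
print(round(entropy(42, 7), 3))  # 0.592
```

This reproduces the reported 0.857 / 0.592 for the 42/7 split; the "estimated match proportion" lines elsewhere in the log are likewise consistent with the match fraction of each oracle-classified sample.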

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analyzing file: diverg(15)249_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 249), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)249_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 810
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 810 weight vectors
  Containing 219 true matches and 591 true non-matches
    (27.04% true matches)
  Identified 754 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   718  (95.23%)
          2 :    33  (4.38%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 754 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 183
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 570

Removed 1 non-pure weight vector

Final number of weight vectors to use: 809
  Number of unique weight vectors: 754

Time to load and analyse the weight vector file: 0.01 sec
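The unique-vector and pureness analysis reported above (occurrence frequencies, plus the fraction of true matches among the duplicates of each unique vector) can be sketched with a `Counter`, under the assumption that each weight vector is a tuple of similarity weights paired with its true match status:

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(vectors):
    """vectors: list of (weight_tuple, is_match) pairs."""
    freq = Counter(w for w, _ in vectors)
    # Occurrence distribution: how many unique vectors occur k times
    occ_dist = Counter(freq.values())
    # Pureness: fraction of true matches among duplicates of each unique vector
    match_counts = defaultdict(int)
    for w, is_match in vectors:
        match_counts[w] += int(is_match)
    pureness = {w: match_counts[w] / freq[w] for w in freq}
    return freq, occ_dist, pureness

# Hypothetical tiny example: one vector occurs twice with mixed labels
data = [((1.0, 0.9), True), ((1.0, 0.9), False), ((0.1, 0.2), False)]
freq, occ_dist, pureness = analyse_weight_vectors(data)
print(pureness[(1.0, 0.9)])  # 0.5 -> non-pure; its minority class is removed
```

A pureness of exactly 1.0 or 0.0 marks a pure unique vector; anything in between (such as the 0.950 entry above) marks a non-pure vector whose minority-class copies are dropped.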

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (754, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 754 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 754 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
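The farthest-first selections listed above can be sketched as a greedy traversal: repeatedly pick the candidate whose minimum distance to the already-selected set is largest. The seed choice and the Euclidean metric here are assumptions; the original script's exact initialisation is not visible in this output:

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors that are mutually far apart (Euclidean)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # assumption: seed with the first vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the remaining vector farthest from its nearest selected vector
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical 1-D example: the two extremes are picked before the middle
print(farthest_first([(0.0,), (0.5,), (1.0,)], 2))  # [(0.0,), (1.0,)]
```

This explains why the selected samples above mix very high and very low similarity values: the traversal deliberately covers the extremes of the weight-vector space before filling in.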

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 25 matches and 60 non-matches
    Purity of oracle classification:  0.706
    Entropy of oracle classification: 0.874
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 669 weight vectors
  Based on 25 matches and 60 non-matches
  Classified 122 matches and 547 non-matches
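The SVM step above trains on the oracle-labelled sample and splits the remaining unlabelled vectors into predicted matches and non-matches, which become the two new clusters in the queue. A minimal sketch with scikit-learn (the kernel and parameters are assumptions; the original script's SVM settings are not shown in this output):

```python
from sklearn import svm

def svm_split(train_vectors, train_labels, unlabelled_vectors):
    """Fit an SVM on oracle-labelled vectors, then split the rest."""
    clf = svm.SVC(kernel="linear")  # assumption: kernel choice is illustrative
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(unlabelled_vectors)
    matches = [v for v, p in zip(unlabelled_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vectors, predictions) if p == 0]
    return matches, non_matches

# Hypothetical 2-D example: high similarities -> match, low -> non-match
train_X = [[0.9, 0.95], [0.85, 1.0], [0.1, 0.2], [0.2, 0.1]]
train_y = [1, 1, 0, 0]
matches, non_matches = svm_split(train_X, train_y, [[0.9, 0.9], [0.15, 0.1]])
print(len(matches), len(non_matches))
```

Each predicted partition is then re-enqueued with the purity, entropy, and match-proportion estimates of the sample it was derived from, as shown in the Loop 2 queue below.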

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (122, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)
    (547, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)

Current size of match and non-match training data sets: 25 / 60

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 122 weight vectors
- Estimated match proportion 0.294

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 122 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing file: diverg(20)500_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 500), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)500_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)941_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985915
recall                 0.234114
f-measure              0.378378
da                           71
dm                            0
ndm                           0
tp                           70
fp                            1
tn                  4.76529e+07
fn                          229
Name: (10, 1 - acm diverg, 941), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)941_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 870
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 870 weight vectors
  Containing 186 true matches and 684 true non-matches
    (21.38% true matches)
  Identified 830 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   796  (95.90%)
          2 :    31  (3.73%)
          3 :     2  (0.24%)
          6 :     1  (0.12%)

Identified 0 non-pure unique weight vectors (from 830 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 166
     0.000 : 664

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 870
  Number of unique weight vectors: 830

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (830, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 830 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 830 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 744 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 157 matches and 587 non-matches
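
The SVM step above trains on the oracle-labelled sample and splits the remaining cluster into a predicted-match and a predicted-non-match sub-cluster. A minimal sketch using scikit-learn's `SVC` (an assumption — the original script may use a different SVM binding and kernel settings; labels follow the 1 = match, 0 = non-match convention):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train an SVM on the oracle-classified sample and use it to
    split the remaining weight vectors into two sub-clusters."""
    clf = SVC()                      # default RBF kernel; an assumption
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(rest_vecs)
    matches     = [v for v, p in zip(rest_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, pred) if p == 0]
    return matches, non_matches
```

In the run above, the 28 + 58 oracle-labelled vectors serve as training data and the remaining 744 vectors are split into sub-clusters of 157 and 587, which are then pushed back onto the queue.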

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (157, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (587, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 587 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 587 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.857, 0.417, 0.750, 0.500, 0.455] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 0 matches and 74 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0
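
The purity and entropy figures printed after each oracle call follow the standard binary definitions: purity is the majority-class fraction of the sample, entropy the base-2 entropy of the match/non-match split (base 2 is an assumption that matches the printed values):

```python
import math

def purity(num_match, num_non_match):
    # Fraction of the majority class in the classified sample
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    # Binary (base-2) entropy of the match / non-match split
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        p = count / total
        if p > 0.0:
            h -= p * math.log2(p)
    return h
```

For the earlier sample of 28 matches and 58 non-matches this gives purity 58/86 ≈ 0.674 and entropy ≈ 0.910; for the all-non-match sample above, purity 1.000 and entropy 0.000, matching the log.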

*** Warning: Oracle returns an empty match dictionary ***
Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

71.0
Analysing file: diverg(10)603_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976378
recall                 0.414716
f-measure               0.58216
da                          127
dm                            0
ndm                           0
tp                          124
fp                            3
tn                  4.76529e+07
fn                          175
Name: (10, 1 - acm diverg, 603), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)603_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 634
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 634 weight vectors
  Containing 137 true matches and 497 true non-matches
    (21.61% true matches)
  Identified 618 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   607  (98.22%)
          2 :     8  (1.29%)
          3 :     2  (0.32%)
          5 :     1  (0.16%)

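The occurrence distribution above (how many unique weight vectors appear once, twice, and so on) can be computed with a `Counter` over the vectors hashed as tuples; a minimal sketch:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Return {occurrence_count: number_of_unique_vectors} for a
    list of weight vectors (each a list or tuple of floats)."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())
```

For the 634 vectors loaded above this yields {1: 607, 2: 8, 3: 2, 5: 1}, i.e. 618 unique vectors in total.
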
Identified 0 non-pure unique weight vectors (from 618 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 123
     0.000 : 495

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 634
  Number of unique weight vectors: 618

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (618, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 618 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 618 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 33 matches and 50 non-matches
    Purity of oracle classification:  0.602
    Entropy of oracle classification: 0.970
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 535 weight vectors
  Based on 33 matches and 50 non-matches
  Classified 227 matches and 308 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (227, 0.6024096385542169, 0.9695235828220428, 0.39759036144578314)
    (308, 0.6024096385542169, 0.9695235828220428, 0.39759036144578314)

Current size of match and non-match training data sets: 33 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.60 and entropy 0.97
- Size 308 weight vectors
- Estimated match proportion 0.398

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 308 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.462, 0.667, 0.600, 0.389, 0.615] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.680, 0.000, 0.609, 0.737, 0.600, 0.529, 0.696] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

127.0
Analysing file: diverg(20)408_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 408), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)408_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector
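
A unique weight vector is "non-pure" when identical feature vectors carry both match and non-match labels; the minority-class copies are removed before clustering. A hypothetical sketch of that filtering step (label convention 1 = match, 0 = non-match; tie-breaking towards non-match is an assumption):

```python
from collections import defaultdict

def remove_minority_labels(weight_vectors, labels):
    """For each distinct weight vector, keep only the copies that
    carry its majority (more frequent) label."""
    label_counts = defaultdict(lambda: [0, 0])   # vec -> [non-match, match]
    for vec, lab in zip(weight_vectors, labels):
        label_counts[tuple(vec)][int(lab)] += 1
    kept_vecs, kept_labels = [], []
    for vec, lab in zip(weight_vectors, labels):
        counts = label_counts[tuple(vec)]
        majority = 1 if counts[1] > counts[0] else 0  # ties go to non-match
        if int(lab) == majority:
            kept_vecs.append(vec)
            kept_labels.append(lab)
    return kept_vecs, kept_labels
```

In the run above, the unique vector with pureness 0.941 occurred 17 times (16 copies with one label, 1 with the other); its single minority-class copy is what was removed, leaving 1051 of the 1052 vectors.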

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 118 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 118 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00 accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-matches
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analisando o arquivo: diverg(20)139_NEW.csv
<class 'pandas.core.series.Series'>
Linha atual aqui, jovem!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 139), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)139_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00 accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00 accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analisando o arquivo: diverg(10)735_NEW.csv
<class 'pandas.core.series.Series'>
Linha atual aqui, jovem!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 735), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)735_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 895
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 895 weight vectors
  Containing 199 true matches and 696 true non-matches
    (22.23% true matches)
  Identified 844 unique weight vectors
  Frequency distribution of occurences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :   810  (95.97%)
          2 :    31  (3.67%)
          3 :     2  (0.24%)
         17 :     1  (0.12%)

Identified 1 non-pure unique weight vectors (from 844 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 168
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 675

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 894
  Number of unique weight vectors: 844

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (844, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 844 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 844 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00 accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 30 matches and 56 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.933
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 758 weight vectors
  Based on 30 matches and 56 non-matches
  Classified 189 matches and 569 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (189, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)
    (569, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)

Current size of match and non-match training data sets: 30 / 56

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 569 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 569 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.538, 0.789, 0.353, 0.545, 0.550] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.444, 0.643, 0.421, 0.200, 0.556] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [1.000, 0.000, 0.350, 0.455, 0.625, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.667, 0.444, 0.556, 0.222, 0.143] (False)
    [1.000, 0.000, 0.583, 0.389, 0.471, 0.545, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00 accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 0 matches and 75 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analyzing the file: diverg(10)597_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985915
recall                 0.234114
f-measure              0.378378
da                           71
dm                            0
ndm                           0
tp                           70
fp                            1
tn                  4.76529e+07
fn                          229
Name: (10, 1 - acm diverg, 597), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)597_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 492
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 492 weight vectors
  Containing 177 true matches and 315 true non-matches
    (35.98% true matches)
  Identified 474 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   462  (97.47%)
          2 :     9  (1.90%)
          3 :     2  (0.42%)
          6 :     1  (0.21%)

Identified 0 non-pure unique weight vectors (from 474 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 159
     0.000 : 315

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 492
  Number of unique weight vectors: 474

Time to load and analyse the weight vector file: 0.01 sec
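
The analysis step above (unique vectors, occurrence frequencies, pureness per unique vector) can be sketched with a `Counter`; the `(weight_tuple, is_true_match)` input format and the function name are assumptions of this sketch:

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(vec_list):
    """Frequency distribution and pureness of a list of
    (weight_tuple, is_true_match) pairs."""
    occ = Counter(w for w, _ in vec_list)   # how often each unique vector occurs
    match_count = defaultdict(int)
    for w, is_match in vec_list:
        match_count[w] += int(is_match)
    # occurrence -> number of unique weight vectors occurring that often
    freq_dist = Counter(occ.values())
    # pureness: fraction of true matches among copies of each unique vector
    pureness = {w: match_count[w] / occ[w] for w in occ}
    return freq_dist, pureness
```

A vector with pureness strictly between 0 and 1 is "non-pure": identical weight vectors generated by both a true match and a true non-match.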

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (474, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 474 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 474 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
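
The "far" initial selection above is a farthest-first traversal; a dependency-free sketch (seeding with the first vector is an assumption of this sketch):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum Euclidean distance to the already selected set is largest."""
    selected = [vectors[0]]               # assumed seed: the first vector
    while len(selected) < k:
        best, best_dist = None, -1.0
        for v in vectors:
            if v in selected:
                continue
            # distance to the closest already-selected vector
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected
```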

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 26 matches and 54 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 394 weight vectors
  Based on 26 matches and 54 non-matches
  Classified 144 matches and 250 non-matches
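
The SVM step trains on the oracle-labelled vectors and splits the remaining cluster into predicted matches and non-matches (in practice, e.g. scikit-learn's `SVC`). As a dependency-free stand-in, a nearest-centroid split illustrates the splitting idea:

```python
def centroid_split(train_matches, train_non_matches, unlabelled):
    """Split unlabelled weight vectors into predicted match / non-match
    sub-clusters by nearest class centroid (a stand-in for the SVM)."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cm = centroid(train_matches)
    cn = centroid(train_non_matches)
    matches = [v for v in unlabelled if dist2(v, cm) <= dist2(v, cn)]
    non_matches = [v for v in unlabelled if dist2(v, cm) > dist2(v, cn)]
    return matches, non_matches
```

Both sub-clusters inherit the parent's purity, entropy, and estimated match proportion until the oracle is applied to them, which is why the two queue entries in Loop 2 carry identical statistics.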

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.675, 0.9097361225311662, 0.325)
    (250, 0.675, 0.9097361225311662, 0.325)

Current size of match and non-match training data sets: 26 / 54

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 250 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 63

Farthest first selection of 63 weight vectors from 250 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.533, 0.000, 0.577, 0.783, 0.429, 0.615, 0.478] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.462, 0.609, 0.643, 0.706, 0.786] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.778, 0.577, 0.455, 0.387, 0.357] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.950, 0.000, 0.619, 0.800, 0.478, 0.280, 0.625] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.667, 0.722, 0.353, 0.545, 0.800] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 63 weight vectors
  The oracle will correctly classify 63 weight vectors and wrongly classify 0
  Classified 1 match and 62 non-matches
    Purity of oracle classification:  0.984
    Entropy of oracle classification: 0.118
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0
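
The simulated oracle returns each true label with probability `oracle_acc` and flips it otherwise; a sketch assuming a symmetric error model (function name and signature are illustrative):

```python
import random

def oracle_classify(ids, true_labels, accuracy=1.0, rng=random):
    """Simulated human oracle: returns each true match status with
    probability `accuracy`, otherwise flips it."""
    results = {}
    for rec_id, label in zip(ids, true_labels):
        correct = rng.random() < accuracy  # random() < 1.0 always holds
        results[rec_id] = label if correct else not label
    return results
```

With `accuracy=1.0`, as in these runs, every label is returned unchanged, so false matches and false non-matches are always zero.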

Deleted 63 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

71.0
Analyzing the file: diverg(20)923_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 923), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)923_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1086
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1086 weight vectors
  Containing 220 true matches and 866 true non-matches
    (20.26% true matches)
  Identified 1030 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   994  (96.50%)
          2 :    33  (3.20%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1030 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 845

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1085
  Number of unique weight vectors: 1030

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1030, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1030 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1030 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 27 matches and 61 non-matches
    Purity of oracle classification:  0.693
    Entropy of oracle classification: 0.889
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 942 weight vectors
  Based on 27 matches and 61 non-matches
  Classified 142 matches and 800 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6931818181818182, 0.8894663896628687, 0.3068181818181818)
    (800, 0.6931818181818182, 0.8894663896628687, 0.3068181818181818)

Current size of match and non-match training data sets: 27 / 61

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 142 weight vectors
- Estimated match proportion 0.307

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 50 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.235
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing the file: diverg(20)146_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (20, 1 - acm diverg, 146), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)146_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1017
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1017 weight vectors
  Containing 197 true matches and 820 true non-matches
    (19.37% true matches)
  Identified 975 unique weight vectors
    Frequency distribution of occurrences of weight vectors:
      Occurrence : Number of weight vectors that occur that often
          1 :   940  (96.41%)
          2 :    32  (3.28%)
          3 :     2  (0.21%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 975 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.000 : 800

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1017
  Number of unique weight vectors: 975

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (975, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 975 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 975 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 25 matches and 62 non-matches
    Purity of oracle classification:  0.713
    Entropy of oracle classification: 0.865
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

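The purity and entropy figures printed after each oracle round can be reproduced with a short sketch, assuming purity is the majority-class fraction of the classified set and entropy is the binary Shannon entropy of its match proportion (both assumptions are consistent with the numbers in this log):

```python
import math

def cluster_stats(num_match, num_non_match):
    """Return (purity, entropy) of a set with the given class counts.

    Purity is the fraction of the majority class; entropy is the binary
    Shannon entropy (base 2) of the match proportion.
    """
    total = num_match + num_non_match
    p = num_match / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# The oracle round above: 25 matches, 62 non-matches
purity, entropy = cluster_stats(25, 62)
print(round(purity, 3), round(entropy, 3))  # 0.713 0.865
```

These are also the first two values of each `(size, purity, entropy, estimated match proportion)` queue tuple printed for the clusters below.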
Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 888 weight vectors
  Based on 25 matches and 62 non-matches
  Classified 108 matches and 780 non-matches

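The split step above trains a classifier on the oracle-labelled vectors and uses its predictions to divide the remaining cluster in two. A minimal sketch with scikit-learn's `SVC` (an assumption — the original program may use a different SVM implementation and kernel):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining cluster into predicted-match and predicted-non-match parts."""
    clf = SVC(kernel="linear")
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    preds = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are what re-enter the queue, inheriting the purity and entropy estimated from the oracle sample.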
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (108, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)
    (780, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 25 / 62

Selected cluster (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 780 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 780 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

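Farthest-first selection, used for each sample above, greedily grows a diverse sample: starting from a seed vector, it repeatedly adds the vector whose minimum distance to the already-selected vectors is largest. A plain-Python sketch (Euclidean distance and seeding from the first vector are assumptions; the original program may seed and measure distance differently):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors: start from the first one, then repeatedly
    take the vector with the largest minimum distance to those selected."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

points = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 0.8)]
print(farthest_first(points, 2))  # [(0.0, 0.0), (1.0, 1.0)]
```

Spreading the sample out like this is what lets a small oracle budget probe the full extent of a cluster rather than one dense region of it.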
Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 14 matches and 57 non-matches
    Purity of oracle classification:  0.803
    Entropy of oracle classification: 0.716
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

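The oracle itself is simulated: with accuracy `acc`, each queried true label is returned unchanged with probability `acc` and flipped otherwise, so at the 100.00% accuracy used in these runs no label is ever flipped (which is why the false match and false non-match counts above are zero). A sketch; the fixed seed is an assumption added for reproducibility:

```python
import random

def oracle_classify(true_labels, acc, rng=None):
    """Simulate a human oracle: return each true label unchanged with
    probability `acc`, flipped otherwise."""
    rng = rng or random.Random(42)
    return [lab if rng.random() < acc else not lab for lab in true_labels]

labels = [True, False, True]
print(oracle_classify(labels, 1.0))  # [True, False, True]
```
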
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analyzing file: diverg(15)811_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 811), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)811_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 822
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 822 weight vectors
  Containing 226 true matches and 596 true non-matches
    (27.49% true matches)
  Identified 765 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   728  (95.16%)
          2 :    34  (4.44%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

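The uniqueness analysis above — counting distinct weight vectors and tabulating how often each occurs — can be sketched with `collections.Counter` (hashing each vector as a tuple is an assumption about the representation):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then tabulate
    how many distinct vectors have each occurrence count."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    freq_dist = Counter(vec_counts.values())
    return len(vec_counts), dict(sorted(freq_dist.items()))

vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.9), (0.0, 0.0)]
print(occurrence_distribution(vecs))  # (3, {1: 2, 2: 1})
```
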
Identified 1 non-pure unique weight vectors (from 765 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 575

Removed 1 non-pure weight vector

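The pureness filter groups duplicate weight vectors, computes each group's fraction of true matches, and removes the minority-class copies of any group that is not fully pure — as with the 0.950-pure vector above, where the single minority copy is dropped. A sketch, under the assumption that exactly the minority-class copies are removed:

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    """Group duplicate weight vectors; within each non-pure group (mixed
    True/False labels) drop the minority-class copies."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(bool(lab))

    kept_vecs, kept_labels, removed = [], [], 0
    for vec, lab in zip(weight_vectors, labels):
        labs = groups[tuple(vec)]
        majority = sum(labs) / len(labs) >= 0.5  # ties counted as matches
        if bool(lab) == majority:
            kept_vecs.append(vec)
            kept_labels.append(lab)
        else:
            removed += 1
    return kept_vecs, kept_labels, removed
```

This leaves every remaining unique weight vector with a single, unambiguous true match status before clustering begins.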
Final number of weight vectors to use: 821
  Number of unique weight vectors: 765

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (765, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 765 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 765 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 680 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 153 matches and 527 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (527, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 527 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 527 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 5 matches and 67 non-matches
    Purity of oracle classification:  0.931
    Entropy of oracle classification: 0.364
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)753_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (20, 1 - acm diverg, 753), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)753_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1025
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1025 weight vectors
  Containing 198 true matches and 827 true non-matches
    (19.32% true matches)
  Identified 983 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   948  (96.44%)
          2 :    32  (3.26%)
          3 :     2  (0.20%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 983 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 807

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1025
  Number of unique weight vectors: 983

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (983, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 983 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 983 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 896 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 93 matches and 803 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (93, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (803, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 93 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 93 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
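
Farthest-first selection, as used above, greedily picks each next vector to maximise its minimum distance to the vectors already selected. A sketch assuming Euclidean distance and an arbitrary first seed (the script's seeding and metric may differ):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: start from the first vector, then
    # repeatedly add the vector whose nearest selected neighbour is farthest.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```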

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and misclassify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
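
The overall control flow the log traces — pop a cluster from the queue, sample it, label the sample with the oracle, and split the remainder when the sample is not pure enough — can be sketched as follows. The helper callables and the budget handling are illustrative, not the script's exact code:

```python
import random

def recursive_select(root_cluster, budget, min_purity, sample, ask_oracle, split):
    # Queue-driven recursion: each loop samples one cluster, asks the oracle
    # to label the sample, grows the match/non-match training sets, and
    # pushes split children back onto the queue while the cluster is impure.
    queue = [root_cluster]
    used = 0
    matches, non_matches = [], []
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # queue ordering: random
        sampled = sample(cluster)
        labels = ask_oracle(sampled)   # one manual classification per vector
        used += len(sampled)
        matches += [v for v, is_m in zip(sampled, labels) if is_m]
        non_matches += [v for v, is_m in zip(sampled, labels) if not is_m]
        rest = [v for v in cluster if v not in sampled]
        n_match = sum(labels)
        purity = max(n_match, len(labels) - n_match) / len(labels)
        if purity < min_purity and rest:   # not pure enough: split further
            queue.extend(split(rest, matches, non_matches))
    return matches, non_matches, used
```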

68.0
Analysing file: diverg(10)231_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 231), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)231_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 769
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 769 weight vectors
  Containing 196 true matches and 573 true non-matches
    (25.49% true matches)
  Identified 727 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   692  (95.19%)
          2 :    32  (4.40%)
          3 :     2  (0.28%)
          7 :     1  (0.14%)
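
The occurrence distribution above is a histogram of how often each distinct weight vector appears; it can be computed with a nested Counter:

```python
from collections import Counter

def occurrence_distribution(vectors):
    # First count how often each weight vector occurs, then count how many
    # distinct vectors share each occurrence count.
    occ = Counter(tuple(v) for v in vectors)   # vector -> occurrence count
    return Counter(occ.values())               # occurrence count -> #vectors
```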

Identified 0 non-pure unique weight vectors (from 727 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 553
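
The pureness of a unique weight vector is the fraction of its occurrences that are true matches; vectors with mixed labels (pureness strictly between 0 and 1) are the non-pure ones counted above. A sketch:

```python
from collections import defaultdict

def pureness(vectors, labels):
    # Per unique weight vector, the fraction of its occurrences that were
    # generated by a true matching record pair.
    counts = defaultdict(lambda: [0, 0])   # vector -> [matches, total]
    for v, is_match in zip(vectors, labels):
        entry = counts[tuple(v)]
        entry[0] += int(is_match)
        entry[1] += 1
    return {v: m / t for v, (m, t) in counts.items()}
```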

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 769
  Number of unique weight vectors: 727

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (727, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 727 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85
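
The sample sizes in this log (85 of 727, 80 of 488, 69 of 270, 44 of 98, ...) are consistent with Cochran's formula for estimating a proportion with a finite-population correction, applied to the cluster's estimated match proportion. The constants z = 1.96 and e = 0.1 are inferred from the numbers, not confirmed from the script:

```python
def sample_size(cluster_size, est_match_prop, z=1.96, e=0.1):
    # Cochran's sample size for a proportion, with the finite-population
    # correction for a cluster of cluster_size vectors.
    n0 = z * z * est_match_prop * (1.0 - est_match_prop) / (e * e)
    return int(round(n0 / (1.0 + (n0 - 1.0) / cluster_size)))

sample_size(727, 0.5)  # -> 85, as in the log above
```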

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 727 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and misclassify 0
  Classified 25 matches and 60 non-matches
    Purity of oracle classification:  0.706
    Entropy of oracle classification: 0.874
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 642 weight vectors
  Based on 25 matches and 60 non-matches
  Classified 98 matches and 544 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (98, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)
    (544, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)

Current size of match and non-match training data sets: 25 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.71 and entropy 0.87
- Size 98 weight vectors
- Estimated match proportion 0.294

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 98 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.420, 1.000, 1.000, 1.000, 1.000, 1.000, 0.947] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and misclassify 0
  Classified 42 matches and 2 non-matches
    Purity of oracle classification:  0.955
    Entropy of oracle classification: 0.267
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(15)450_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981481
recall                 0.177258
f-measure              0.300283
da                           54
dm                            0
ndm                           0
tp                           53
fp                            1
tn                  4.76529e+07
fn                          246
Name: (15, 1 - acm diverg, 450), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)450_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 524
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 524 weight vectors
  Containing 210 true matches and 314 true non-matches
    (40.08% true matches)
  Identified 488 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   471  (96.52%)
          2 :    14  (2.87%)
          3 :     2  (0.41%)
         19 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 488 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 176
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 311

Removed 1 non-pure weight vector

Final number of weight vectors to use: 523
  Number of unique weight vectors: 488

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (488, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 488 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 488 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and misclassify 0
  Classified 33 matches and 47 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.978
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 408 weight vectors
  Based on 33 matches and 47 non-matches
  Classified 138 matches and 270 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.5875, 0.9777945702913884, 0.4125)
    (270, 0.5875, 0.9777945702913884, 0.4125)

Current size of match and non-match training data sets: 33 / 47

Selected cluster (queue ordering: random) with:
- Purity 0.59 and entropy 0.98
- Size 270 weight vectors
- Estimated match proportion 0.412

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 270 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.800, 0.636, 0.563, 0.545, 0.722] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and misclassify 0
  Classified 4 matches and 65 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.319
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

54.0
Analysing file: diverg(10)570_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 570), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)570_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 419
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 419 weight vectors
  Containing 202 true matches and 217 true non-matches
    (48.21% true matches)
  Identified 393 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   379  (96.44%)
          2 :    11  (2.80%)
          3 :     2  (0.51%)
         12 :     1  (0.25%)

Identified 1 non-pure unique weight vector (from 393 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 176
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 216

Removed 1 non-pure weight vector

Final number of weight vectors to use: 418
  Number of unique weight vectors: 393

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (393, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 393 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 393 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
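
The "farthest first" step above can be sketched as a greedy farthest-first traversal: each new pick is the vector whose minimum distance to the already-selected set is largest. This is a minimal sketch, assuming Euclidean distance between weight vectors and a random first pick (the script's actual distance measure and seeding are not shown in this log):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedily select k row indices: each new pick maximises the
    minimum distance to the vectors already selected."""
    vecs = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    first = int(rng.integers(len(vecs)))
    selected = [first]
    # dists[i] = distance from vecs[i] to its nearest selected vector
    dists = np.linalg.norm(vecs - vecs[first], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(dists))        # farthest from the selected set
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(vecs - vecs[nxt], axis=1))
    return selected
```

Because a selected vector's distance to the set drops to zero, it is never picked twice, so the routine returns k distinct indices for distinct input vectors.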

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 40 matches and 37 non-matches
    Purity of oracle classification:  0.519
    Entropy of oracle classification: 0.999
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0
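
The purity and entropy figures reported for each oracle classification follow the usual binary definitions: purity is the majority-class fraction, and entropy is the Shannon entropy (in bits) of the match/non-match split. A minimal sketch:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy (bits) of the match / non-match proportions."""
    n = num_match + num_non_match
    p = num_match / n
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 40 / 37 split above this gives purity 0.519 and entropy 0.999, matching the values reported in the log.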

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 316 weight vectors
  Based on 40 matches and 37 non-matches
  Classified 136 matches and 180 non-matches
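
The SVM step trains on the oracle-labelled vectors and uses its predictions to split the remaining cluster into two sub-clusters, which are then pushed back onto the queue. A minimal sketch assuming scikit-learn's SVC and synthetic stand-in data (the real weight vectors and the script's kernel choice are not shown in this log):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.random((77, 7))            # hypothetical oracle-labelled vectors
y_train = np.array([1] * 40 + [0] * 37)  # 40 matches, 37 non-matches
X_rest = rng.random((316, 7))            # remaining unlabelled cluster vectors

clf = SVC(kernel='linear').fit(X_train, y_train)
pred = clf.predict(X_rest)
match_cluster = X_rest[pred == 1]        # one new sub-cluster for the queue
non_match_cluster = X_rest[pred == 0]    # the other sub-cluster
```

Every vector lands in exactly one of the two sub-clusters, so their sizes sum to the size of the split cluster (136 + 180 = 316 above).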

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.5194805194805194, 0.9989047442823606, 0.5194805194805194)
    (180, 0.5194805194805194, 0.9989047442823606, 0.5194805194805194)

Current size of match and non-match training data sets: 40 / 37

Selected cluster (queue ordering: random) with:
- Purity 0.52 and entropy 1.00
- Size 136 weight vectors
- Estimated match proportion 0.519

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.933, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 50 matches and 6 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.491
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)636_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985915
recall                 0.234114
f-measure              0.378378
da                           71
dm                            0
ndm                           0
tp                           70
fp                            1
tn                  4.76529e+07
fn                          229
Name: (10, 1 - acm diverg, 636), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)636_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 845
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 845 weight vectors
  Containing 186 true matches and 659 true non-matches
    (22.01% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   771  (95.78%)
          2 :    31  (3.85%)
          3 :     2  (0.25%)
          6 :     1  (0.12%)
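
The unique-vector count and the occurrence frequency distribution above can be computed with two nested Counters. A minimal sketch over hypothetical weight vectors, stored as tuples so they are hashable:

```python
from collections import Counter

# Hypothetical weight vectors standing in for the real data.
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.9),
           (0.2, 0.9), (0.2, 0.9), (0.7, 0.1)]

vec_counts = Counter(vectors)             # occurrences of each unique vector
freq_dist = Counter(vec_counts.values())  # how many unique vectors occur k times

print(len(vec_counts))                    # number of unique weight vectors
print(sorted(freq_dist.items()))          # (occurrence, count) pairs
```

Here `len(vec_counts)` is 3 and `freq_dist` maps occurrence 1, 2 and 3 each to one unique vector, mirroring the "Occurrence : Number of weight vectors" table above.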

Identified 0 non-pure unique weight vectors (from 805 unique weight vectors)
Pureness (fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 166
     0.000 : 639

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 845
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 163 matches and 556 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (163, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (556, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 556 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 556 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 0 matches and 74 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

71.0
Analysing file: diverg(20)580_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 580), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)580_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 101 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
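
Farthest-first selection, as used throughout this log, greedily adds the vector whose minimum distance to the already-selected set is largest, spreading the sample across the weight-vector space. A minimal sketch; seeding with the first vector and using squared Euclidean distance are assumptions, since the log does not show either choice:

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of (distinct) vectors:
    repeatedly add the vector with the largest minimum distance to the
    vectors selected so far. Seed choice (index 0) is an assumption."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    selected = [0]                               # assumed seed
    min_d = [sq_dist(vectors[0], v) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        nxt = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):          # update distances to the set
            min_d[i] = min(min_d[i], sq_dist(vectors[nxt], v))
    return [vectors[i] for i in selected]
```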

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)647_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 647), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)647_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 689
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 689 weight vectors
  Containing 219 true matches and 470 true non-matches
    (31.79% true matches)
  Identified 656 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   640  (97.56%)
          2 :    13  (1.98%)
          3 :     2  (0.30%)
         17 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 656 unique weight vectors)
Pureness (percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 469

Removed 1 non-pure weight vector

Final number of weight vectors to use: 688
  Number of unique weight vectors: 656
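
The pureness step above groups identical weight vectors, computes the fraction of true matches per unique vector, and removes the minority-class copies of any non-pure vector (here one group with pureness 0.941, leaving 688 of 689 vectors). A sketch of that clean-up, following the log's description (function and variable names are illustrative):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """Group identical weight vectors; for any non-pure group (produced by
    both matches and non-matches), drop the minority-class copies.
    `weight_vectors` is a list of (vector_tuple, is_match) pairs."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        num_m = sum(labels)
        pure = num_m in (0, len(labels))         # all matches or all non-matches
        majority_is_match = num_m >= len(labels) - num_m
        for is_match in labels:
            if pure or is_match == majority_is_match:
                kept.append((vec, is_match))
    return kept
```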

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (656, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 656 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 656 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 572 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 128 matches and 444 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (128, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (444, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 444 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 444 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.409, 0.654, 0.500, 0.516, 0.333] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.500, 0.452, 0.632, 0.714, 0.667] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.786, 0.833, 0.545, 0.478, 0.346] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 16 matches and 57 non-matches
    Purity of oracle classification:  0.781
    Entropy of oracle classification: 0.759
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)875_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (10, 1 - acm diverg, 875), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)875_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 987
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 987 weight vectors
  Containing 169 true matches and 818 true non-matches
    (17.12% true matches)
  Identified 950 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   919  (96.74%)
          2 :    28  (2.95%)
          3 :     2  (0.21%)
          6 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 950 unique weight vectors)
Pureness (percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 152
     0.000 : 798

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 987
  Number of unique weight vectors: 950

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (950, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 950 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 950 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 25 matches and 62 non-matches
    Purity of oracle classification:  0.713
    Entropy of oracle classification: 0.865
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 863 weight vectors
  Based on 25 matches and 62 non-matches
  Classified 95 matches and 768 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (95, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)
    (768, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 25 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.71 and entropy 0.87
- Size 95 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 95 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing file: diverg(20)143_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 143), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)143_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
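Farthest-first selection, as used above, is the standard greedy traversal: keep the set of selected vectors and repeatedly add the vector whose distance to its nearest selected vector is largest. A minimal sketch under Euclidean distance (an illustrative re-implementation; the program's actual seeding and distance function may differ):

```python
def farthest_first(vectors, k):
    """Greedily select k vectors, starting from the first one and always
    adding the vector farthest from its closest already-selected vector.
    Runs in O(n * k) distance computations."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]
    # Distance from every vector to its nearest selected vector so far
    d_near = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: d_near[j])
        selected.append(vectors[i])
        d_near = [min(dn, dist(v, vectors[i])) for dn, v in zip(d_near, vectors)]
    return selected
```

This heuristic tends to pick vectors spread across the whole cluster, which is why both clear matches and clear non-matches appear in the selected lists.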

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
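The purity and entropy figures reported for an oracle classification are the standard two-class measures: purity is the majority-class fraction, and entropy is the Shannon entropy (in bits) of the match proportion. A sketch consistent with the numbers above (30 matches and 58 non-matches give purity 58/88 ≈ 0.659 and entropy ≈ 0.926):

```python
import math

def purity(n_match, n_nonmatch):
    """Fraction of the majority class among the classified vectors."""
    total = n_match + n_nonmatch
    return max(n_match, n_nonmatch) / total

def entropy(n_match, n_nonmatch):
    """Binary Shannon entropy (in bits) of the match proportion."""
    p = n_match / (n_match + n_nonmatch)
    if p in (0.0, 1.0):
        return 0.0  # a pure cluster has zero entropy
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)
```

Note that the same match proportion (30/88 ≈ 0.341) is carried forward as the estimated match proportion of the child clusters in the next loop.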

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 159 matches and 780 non-matches
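The split step trains an SVM on the oracle-labelled vectors and uses it to classify the remaining unlabelled ones, producing the two child clusters queued below. As a self-contained sketch, here is a linear SVM trained by Pegasos-style sub-gradient descent on the hinge loss; the program presumably uses a full SVM library, and the hyper-parameters here are illustrative only:

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=300, seed=0):
    """Pegasos-style sub-gradient descent for a linear SVM.
    X: list of weight vectors; y: labels in {-1, +1} (non-match / match)."""
    rng = random.Random(seed)
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularisation shrinkage
            if margin < 1.0:  # hinge loss active: step towards this example
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def svm_classify(w, b, x):
    """+1 (match) or -1 (non-match) by the sign of the decision value."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1
```

Applying `svm_classify` to each of the 939 remaining vectors splits them into a predicted-match cluster and a predicted-non-match cluster, both of which go back on the queue.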

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (780, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 159 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 159 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)778_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 778), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)778_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 863
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 863 weight vectors
  Containing 156 true matches and 707 true non-matches
    (18.08% true matches)
  Identified 827 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   799  (96.61%)
          2 :    25  (3.02%)
          3 :     2  (0.24%)
          8 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 827 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 140
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 686

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 855
  Number of unique weight vectors: 826

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (826, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 826 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 826 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 740 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 120 matches and 620 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (120, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (620, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 120 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 120 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.952, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 40 matches and 10 non-matches
    Purity of oracle classification:  0.800
    Entropy of oracle classification: 0.722
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  10
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing the file: diverg(10)741_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.197324
f-measure              0.329609
da                           59
dm                            0
ndm                           0
tp                           59
fp                            0
tn                  4.76529e+07
fn                          240
Name: (10, 1 - acm diverg, 741), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)741_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 641
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 641 weight vectors
  Containing 190 true matches and 451 true non-matches
    (29.64% true matches)
  Identified 596 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   562  (94.30%)
          2 :    31  (5.20%)
          3 :     2  (0.34%)
         11 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 596 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 165
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 430

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 640
  Number of unique weight vectors: 596

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (596, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 596 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 596 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 26 matches and 56 non-matches
    Purity of oracle classification:  0.683
    Entropy of oracle classification: 0.901
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 514 weight vectors
  Based on 26 matches and 56 non-matches
  Classified 188 matches and 326 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (188, 0.6829268292682927, 0.9011701959974223, 0.3170731707317073)
    (326, 0.6829268292682927, 0.9011701959974223, 0.3170731707317073)

Current size of match and non-match training data sets: 26 / 56

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 326 weight vectors
- Estimated match proportion 0.317

Sample size for this cluster: 66

Farthest first selection of 66 weight vectors from 326 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.269, 0.478, 0.750, 0.385, 0.455] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.538, 0.600, 0.471, 0.632, 0.688] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.500, 0.571, 0.467, 0.467, 0.389] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.179, 0.500, 0.412, 0.357] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.800, 0.667, 0.381, 0.550, 0.429] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.571, 0.286, 0.333, 0.571, 0.600] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
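The farthest-first selection above starts from one vector and repeatedly picks the vector whose minimum distance to the already selected set is largest, spreading the sample across the cluster. A sketch under the assumptions of Euclidean distance and a fixed starting index; the original script's metric and starting rule may differ:

```python
import math

def farthest_first(vectors, k, start=0):
    """Select k vectors by farthest-first traversal."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    # min_dist[i] = distance from vector i to its nearest selected vector
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[nxt]))
    return [vectors[i] for i in selected]
```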

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 0 matches and 66 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

59.0
Analysing file: diverg(15)758_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (15, 1 - acm diverg, 758), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)758_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 581
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 581 weight vectors
  Containing 187 true matches and 394 true non-matches
    (32.19% true matches)
  Identified 559 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   543  (97.14%)
          2 :    13  (2.33%)
          3 :     2  (0.36%)
          6 :     1  (0.18%)

Identified 0 non-pure unique weight vectors (from 559 unique weight vectors)
Pureness (fraction of occurrences that are matches) per unique weight vector:
  Pureness : Count
     1.000 : 167
     0.000 : 392

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 581
  Number of unique weight vectors: 559
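The uniqueness and pureness analysis groups duplicate weight vectors and computes, for each unique vector, the fraction of its occurrences that stem from true matches; a vector is non-pure when that fraction is strictly between 0 and 1. A sketch, assuming the data is given as parallel lists of vectors and match flags (a layout assumption, not from the script):

```python
from collections import defaultdict

def pureness_distribution(weight_vectors, match_flags):
    """Map each unique weight vector to its pureness (match fraction)."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [num_matches, total]
    for vec, is_match in zip(weight_vectors, match_flags):
        key = tuple(vec)  # lists are unhashable, so key on a tuple
        counts[key][0] += int(is_match)
        counts[key][1] += 1
    return {key: m / t for key, (m, t) in counts.items()}
```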

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (559, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 559 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 559 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 29 matches and 53 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.937
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 477 weight vectors
  Based on 29 matches and 53 non-matches
  Classified 141 matches and 336 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)
    (336, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)

Current size of match and non-match training data sets: 29 / 53

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 141 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 47 matches and 7 non-matches
    Purity of oracle classification:  0.870
    Entropy of oracle classification: 0.556
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(20)997_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 997), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)997_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (fraction of occurrences that are matches) per unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority-class weight vectors with this pureness are removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 156 matches and 800 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (800, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 156 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 156 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)201_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 201), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)201_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
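The frequency distribution above (how many unique weight vectors occur once, twice, and so on) amounts to two counting passes. A sketch, assuming weight vectors arrive as lists of floats (the script's actual data layout may differ):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First pass: how often does each unique weight vector occur?
    vec_counts = Counter(map(tuple, weight_vectors))
    # Second pass: how many unique vectors share each occurrence count?
    return Counter(vec_counts.values())

vecs = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3]]
dist = occurrence_distribution(vecs)  # one vector occurs twice, one occurs once
```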

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
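The "farthest first" selection listed above repeatedly picks the weight vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster. A minimal sketch, assuming Euclidean distance and a fixed starting vector (the actual implementation may seed differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: select k spread-out vectors."""
    selected = [vectors[0]]  # fixed start, for illustration only
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the remaining vector farthest from its nearest selected one
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```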

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
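After each oracle round, the remaining unlabelled vectors in the cluster are split by a classifier trained on the oracle-labelled sample (an SVM in this run). As a dependency-free stand-in, a nearest-centroid rule (not the SVM the script actually uses) illustrates the split step:

```python
def split_by_centroids(matches, non_matches, unlabelled):
    """Assign each unlabelled vector to the closer class centroid."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    m_cen, n_cen = centroid(matches), centroid(non_matches)
    pred_m = [v for v in unlabelled if sq_dist(v, m_cen) < sq_dist(v, n_cen)]
    pred_n = [v for v in unlabelled if sq_dist(v, m_cen) >= sq_dist(v, n_cen)]
    return pred_m, pred_n
```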

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)812_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.197324
f-measure              0.329609
da                           59
dm                            0
ndm                           0
tp                           59
fp                            0
tn                  4.76529e+07
fn                          240
Name: (10, 1 - acm diverg, 812), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)812_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 893
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 893 weight vectors
  Containing 177 true matches and 716 true non-matches
    (19.82% true matches)
  Identified 848 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   814  (95.99%)
          2 :    31  (3.66%)
          3 :     2  (0.24%)
         11 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 848 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 152
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 695

Removed 1 non-pure weight vector

Final number of weight vectors to use: 892
  Number of unique weight vectors: 848

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (848, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 848 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 848 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 762 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 172 matches and 590 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (172, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (590, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 590 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 590 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.565, 0.667, 0.600, 0.412, 0.381] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.783, 0.357, 0.750, 0.412, 0.238] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.500, 0.600, 0.353, 0.611, 0.526] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 0 matches and 74 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

59.0
Analysing file: diverg(15)292_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 292), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)292_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 597
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 597 weight vectors
  Containing 201 true matches and 396 true non-matches
    (33.67% true matches)
  Identified 566 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   550  (97.17%)
          2 :    13  (2.30%)
          3 :     2  (0.35%)
         15 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 566 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 393

Removed 1 non-pure weight vector

Final number of weight vectors to use: 596
  Number of unique weight vectors: 566

Time to load and analyse the weight vector file: 0.04 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (566, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 566 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 31 matches and 51 non-matches
    Purity of oracle classification:  0.622
    Entropy of oracle classification: 0.957
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0
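The purity and entropy figures reported after each oracle step follow directly from the match/non-match counts: purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (function names are illustrative, not the program's own):

```python
import math

def purity(num_matches, num_non_matches):
    # Purity: fraction of vectors belonging to the majority class.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Binary Shannon entropy of the match proportion, in bits.
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# The 31 matches / 51 non-matches above give:
print(round(purity(31, 51), 3))   # 0.622
print(round(entropy(31, 51), 3))  # 0.957
```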

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 484 weight vectors
  Based on 31 matches and 51 non-matches
  Classified 144 matches and 340 non-matches
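The SVM split of the remaining unlabelled weight vectors can be sketched with scikit-learn (an assumption: the program may use a different SVM implementation and kernel; `svm_split` is an illustrative name):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(match_vecs, non_match_vecs, unlabelled_vecs):
    """Train an SVM on the oracle-labelled sample and split the rest
    of the cluster into predicted matches and non-matches."""
    X = np.array(match_vecs + non_match_vecs)
    y = np.array([1] * len(match_vecs) + [0] * len(non_match_vecs))
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(X, y)
    pred = clf.predict(np.array(unlabelled_vecs))
    matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 0]
    return matches, non_matches
```

Both halves of the split are then pushed back onto the cluster queue, as the Loop 2 output below shows.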

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)
    (340, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)

Current size of match and non-match training data sets: 31 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 144 weight vectors
- Estimated match proportion 0.378

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 144 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
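The farthest-first selection above can be sketched in pure Python: greedily grow the sample by always adding the vector farthest from everything selected so far (a sketch, assuming Euclidean distance and the first vector as seed; the program's metric and tie-breaking may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors: start from the first vector, then
    repeatedly add the vector whose minimum distance to the already
    selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # A candidate's distance to the selected set is the distance
        # to its nearest selected vector.
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```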

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 49 matches and 7 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(15)605_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 605), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)605_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 793
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 793 weight vectors
  Containing 223 true matches and 570 true non-matches
    (28.12% true matches)
  Identified 754 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   735  (97.48%)
          2 :    16  (2.12%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 754 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 186
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 567

Removed 1 non-pure weight vector

Final number of weight vectors to use: 792
  Number of unique weight vectors: 754

Time to load and analyse the weight vector file: 0.01 sec
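The de-duplication and pureness filtering reported above can be sketched as follows (a sketch: pureness is taken as the fraction of true matches among identical weight vectors, with minority-class copies of non-pure vectors dropped, as the log describes; the tie handling at 0.5 is illustrative):

```python
from collections import defaultdict

def filter_non_pure(weight_vectors, match_flags):
    """Group identical weight vectors, compute each group's pureness
    (fraction of true matches), and drop minority-class copies of any
    group that is not fully pure."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, match_flags):
        groups[tuple(vec)].append(is_match)

    kept = []
    for vec, flags in groups.items():
        pureness = sum(flags) / len(flags)
        majority_is_match = pureness >= 0.5
        for is_match in flags:
            # Pure groups (pureness 0.0 or 1.0) are kept in full;
            # otherwise only the majority-class copies survive.
            if pureness in (0.0, 1.0) or is_match == majority_is_match:
                kept.append((list(vec), is_match))
    return kept
```

In the run above this removes the single non-match copy from the group of 20 identical vectors with pureness 0.950, taking the data set from 793 to 792 weight vectors.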

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (754, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 754 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 754 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 32 matches and 53 non-matches
    Purity of oracle classification:  0.624
    Entropy of oracle classification: 0.956
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 669 weight vectors
  Based on 32 matches and 53 non-matches
  Classified 149 matches and 520 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6235294117647059, 0.9555111232924128, 0.3764705882352941)
    (520, 0.6235294117647059, 0.9555111232924128, 0.3764705882352941)

Current size of match and non-match training data sets: 32 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 149 weight vectors
- Estimated match proportion 0.376

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 52 matches and 4 non-matches
    Purity of oracle classification:  0.929
    Entropy of oracle classification: 0.371
    Number of true matches:      52
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)276_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 276), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)276_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 955
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 955 weight vectors
  Containing 216 true matches and 739 true non-matches
    (22.62% true matches)
  Identified 900 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   864  (96.00%)
          2 :    33  (3.67%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 900 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 718

Removed 1 non-pure weight vector

Final number of weight vectors to use: 954
  Number of unique weight vectors: 900

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (900, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 900 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 900 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 23 matches and 63 non-matches
    Purity of oracle classification:  0.733
    Entropy of oracle classification: 0.838
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 814 weight vectors
  Based on 23 matches and 63 non-matches
  Classified 0 matches and 814 non-matches

40.0
Analysing file: diverg(15)948_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (15, 1 - acm diverg, 948), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)948_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1005
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1005 weight vectors
  Containing 195 true matches and 810 true non-matches
    (19.40% true matches)
  Identified 963 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   928  (96.37%)
          2 :    32  (3.32%)
          3 :     2  (0.21%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 963 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 790

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1005
  Number of unique weight vectors: 963

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (963, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 963 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 963 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
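The "farthest first" selection above can be sketched roughly as follows: pick an arbitrary starting vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. This is a minimal illustration assuming Euclidean distance and a random start; the program's actual implementation may differ in both choices.

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Select k vectors by farthest-first traversal: each new pick
    maximises its minimum Euclidean distance to those already chosen."""
    X = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(X)))]   # arbitrary starting vector
    # minimum distance from every vector to the selected set so far
    min_dist = np.linalg.norm(X - X[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))       # farthest from all selected
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected
```

Because each pick maximises the distance to everything chosen so far, the selected sample tends to cover the corners of the weight-vector space, which is why both clearly matching and clearly non-matching vectors appear in the lists above.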

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
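The purity and entropy figures reported for each oracle classification follow from the match/non-match counts alone. A small sketch (assuming purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion, which is consistent with the numbers in this log):

```python
from math import log2

def purity_entropy(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary entropy of a
    cluster, given its oracle-classified match / non-match counts."""
    n = num_matches + num_non_matches
    p = num_matches / n              # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    if 0.0 < p < 1.0:
        entropy = -(p * log2(p) + (1 - p) * log2(1 - p))
    return purity, entropy
```

For the 28 matches and 59 non-matches above this gives purity 59/87 ≈ 0.678 and entropy ≈ 0.906, matching the logged values.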

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 876 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 142 matches and 734 non-matches
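The SVM step that splits the remaining cluster can be sketched with scikit-learn. The exact kernel and parameters used by the program are not shown in this log, so the linear kernel below is an assumption; the essential idea is to train on the oracle-labelled vectors and partition the unlabelled remainder by the predicted class.

```python
import numpy as np
from sklearn.svm import SVC  # kernel choice below is an assumption

def svm_split(labeled_vecs, labels, unlabeled_vecs):
    """Train an SVM on oracle-labelled weight vectors (1 = match,
    0 = non-match) and split the rest of the cluster by prediction."""
    clf = SVC(kernel='linear')
    clf.fit(np.asarray(labeled_vecs), np.asarray(labels))
    pred = clf.predict(np.asarray(unlabeled_vecs))
    matches = [v for v, p in zip(unlabeled_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabeled_vecs, pred) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are what re-enter the queue in the next loop (here with sizes 142 and 734).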

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (734, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 142 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 48 matches and 5 non-matches
    Purity of oracle classification:  0.906
    Entropy of oracle classification: 0.451
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analysing file: diverg(15)740_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 740), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)740_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 201 true matches and 752 true non-matches
    (21.09% true matches)
  Identified 908 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   874  (96.26%)
          2 :    31  (3.41%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 908 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 952
  Number of unique weight vectors: 908
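The non-pure-vector removal step above (dropping minority-class copies of any weight vector that occurs with both labels) can be sketched as follows; this is an illustrative reconstruction from the log, not the program's actual code:

```python
from collections import Counter, defaultdict

def remove_non_pure(weight_vectors, labels):
    """Drop minority-class copies of weight vectors that occur with
    both match and non-match labels, so every unique vector is pure."""
    by_vec = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        by_vec[tuple(vec)].append(lab)
    kept_vecs, kept_labs = [], []
    for vec, labs in by_vec.items():
        majority = Counter(labs).most_common(1)[0][0]
        for lab in labs:
            if lab == majority:              # keep only majority-class copies
                kept_vecs.append(list(vec))
                kept_labs.append(lab)
    return kept_vecs, kept_labs
```

In the run above, one vector occurred 11 times with pureness 0.909 (10 copies of one label, 1 of the other), so exactly one copy was removed, leaving 952 of the 953 vectors.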

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (908, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 908 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 908 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 25 matches and 62 non-matches
    Purity of oracle classification:  0.713
    Entropy of oracle classification: 0.865
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 821 weight vectors
  Based on 25 matches and 62 non-matches
  Classified 110 matches and 711 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (110, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)
    (711, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 25 / 62

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 711 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 711 vectors
  The selected farthest weight vectors are:
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 13 matches and 58 non-matches
    Purity of oracle classification:  0.817
    Entropy of oracle classification: 0.687
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(20)830_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (20, 1 - acm diverg, 830), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)830_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1041
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1041 weight vectors
  Containing 213 true matches and 828 true non-matches
    (20.46% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   954  (96.46%)
          2 :    32  (3.24%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 807

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1040
  Number of unique weight vectors: 989

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 109 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 109 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
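
Farthest-first selection, as used above, greedily picks the weight vector whose distance to its nearest already-selected vector is largest, spreading the sample across the cluster. A minimal sketch (Euclidean distance and seeding with the first vector are assumptions):

```python
import numpy as np

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly select the vector with
    the largest distance to the closest already-selected vector."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [0]  # seeding with the first vector is an assumption
    # min_dist[i] = distance from vector i to its nearest selected vector
    min_dist = np.linalg.norm(vectors - vectors[0], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        d = np.linalg.norm(vectors - vectors[nxt], axis=1)
        min_dist = np.minimum(min_dist, d)
    return selected

rng = np.random.default_rng(0)
sample = farthest_first(rng.uniform(0, 1, (109, 7)), 47)
print(len(sample), len(set(sample)))
```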

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(20)737_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 737), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)737_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
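
The duplicate analysis above counts how often each distinct weight vector occurs, then tallies how many distinct vectors share each occurrence count. A sketch of that bookkeeping (assuming vectors are represented as hashable tuples; the function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count occurrences per distinct weight vector, then tally how many
    distinct vectors share each occurrence count."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    dist = Counter(vec_counts.values())
    return len(vec_counts), dict(sorted(dist.items()))

vecs = [(0.5, 1.0)] * 3 + [(1.0, 0.0)] * 2 + [(0.1, 0.2)]
print(occurrence_distribution(vecs))  # → (3, {1: 1, 2: 1, 3: 1})
```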

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector
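
A unique weight vector is non-pure when it was generated by both true matching and true non-matching record pairs; the removal step above drops its minority-class copies (here the vector with pureness 0.950 loses its single non-match copy). A sketch of that filter (the data layout is an assumption):

```python
from collections import defaultdict

def remove_non_pure(labelled_vectors):
    """Group (vector, is_match) pairs by vector; for vectors that occur
    with both labels, keep only the majority-class copies."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    kept, removed = [], 0
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)  # fraction that are matches
        majority = pureness >= 0.5            # majority class of this vector
        for lab in labels:
            if lab == majority:
                kept.append((vec, lab))
            else:
                removed += 1
    return kept, removed

# One vector seen 19 times as a match and once as a non-match (pureness 0.95)
data = [((0.9, 0.9), True)] * 19 + [((0.9, 0.9), False)] + [((0.1, 0.1), False)]
kept, removed = remove_non_pure(data)
print(len(kept), removed)  # → 20 1
```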

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 24 matches and 64 non-matches
    Purity of oracle classification:  0.727
    Entropy of oracle classification: 0.845
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 24 matches and 64 non-matches
  Classified 91 matches and 857 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (91, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)
    (857, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)

Current size of match and non-match training data sets: 24 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.73 and entropy 0.85
- Size 857 weight vectors
- Estimated match proportion 0.273

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 857 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 18 matches and 52 non-matches
    Purity of oracle classification:  0.743
    Entropy of oracle classification: 0.822
    Number of true matches:      18
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)91_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 91), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)91_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1031
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1031 weight vectors
  Containing 212 true matches and 819 true non-matches
    (20.56% true matches)
  Identified 979 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   944  (96.42%)
          2 :    32  (3.27%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 979 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 798

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1030
  Number of unique weight vectors: 979

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (979, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 979 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 979 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 892 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 136 matches and 756 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (756, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 756 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 756 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 10 matches and 63 non-matches
    Purity of oracle classification:  0.863
    Entropy of oracle classification: 0.576
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(20)581_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 581), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)581_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec
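
The analysis step above (occurrence histogram and pureness breakdown) can be sketched as a single pass over the loaded vectors. The dict layout and function name here are assumptions for illustration, not the original script's data structures:

```python
from collections import Counter

def analyse_weight_vectors(weight_vec_dict):
    """Summarise {(rec_id1, rec_id2): (is_match, vector)}.

    Returns (a) a histogram mapping occurrence count -> number of unique
    vectors occurring that often, and (b) the pureness (fraction of true
    matches) of each unique vector.
    """
    occ = Counter()          # how often each unique vector occurs
    match_count = Counter()  # how often it came from a true match
    for (rec_id1, rec_id2), (is_match, vec) in weight_vec_dict.items():
        key = tuple(vec)
        occ[key] += 1
        if is_match:
            match_count[key] += 1
    freq_dist = Counter(occ.values())
    pureness = {k: match_count[k] / occ[k] for k in occ}
    return freq_dist, pureness
```

A vector with pureness strictly between 0 and 1 is non-pure (its record pairs disagree on the true match status), which is what triggers the minority-class removal reported above.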

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
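
The "far" method is a farthest-first traversal: after a seed pick, each further vector is the one maximising its minimum distance to the vectors already selected. A generic sketch under assumed Euclidean distance and first-element seeding (the original may differ in both):

```python
import math

def farthest_first(vectors, k):
    """Greedily select up to k vectors by farthest-first traversal."""
    if not vectors or k <= 0:
        return []
    selected = [vectors[0]]  # seed: first vector (an arbitrary choice)
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(vectors[0], v) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(vectors[idx], v))
    return selected
```

This spreads the sample across the weight-vector space, which is why the selections above mix clearly matching and clearly non-matching vectors.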

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
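
The SVM splitting step trains on the oracle-labelled sample and partitions the remaining cluster by predicted class. A sketch assuming scikit-learn and a linear kernel (the original program may use a different SVM implementation or kernel):

```python
from sklearn.svm import SVC

def split_cluster_with_svm(train_match, train_non_match, cluster):
    """Partition a cluster by an SVM trained on oracle-labelled vectors.

    Returns (predicted_matches, predicted_non_matches). The linear
    kernel and scikit-learn itself are assumptions made here for
    illustration.
    """
    X = train_match + train_non_match
    y = [1] * len(train_match) + [0] * len(train_non_match)
    clf = SVC(kernel="linear")
    clf.fit(X, y)
    pred = clf.predict(cluster)
    matches = [v for v, p in zip(cluster, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster, pred) if p == 0]
    return matches, non_matches
```

Both predicted sub-clusters then re-enter the queue, which is consistent with the two Loop 2 entries below sharing the purity and entropy of the sample they were derived from.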

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)339_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 339), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)339_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 810
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 810 weight vectors
  Containing 223 true matches and 587 true non-matches
    (27.53% true matches)
  Identified 756 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   719  (95.11%)
          2 :    34  (4.50%)
          3 :     2  (0.26%)
         17 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 756 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 566

Removed 1 non-pure weight vector

Final number of weight vectors to use: 809
  Number of unique weight vectors: 756

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (756, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 756 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 756 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 671 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 94 matches and 577 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (577, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 94 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 94 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.950, 0.923, 0.941] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 44 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(20)841_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 841), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)841_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
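The purity and entropy values reported after each oracle step are the standard cluster purity and the binary Shannon entropy of the match proportion. A minimal sketch that reproduces the figures above (the helper name is hypothetical, not from the original program):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity and binary Shannon entropy of a classified cluster.

    E.g. 23 matches / 65 non-matches gives purity ~0.739 and
    entropy ~0.829, as in the log above.
    """
    total = num_match + num_non_match
    p = num_match / total                           # match proportion
    purity = max(num_match, num_non_match) / total  # majority-class share
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)           # base-2 Shannon entropy
    return purity, entropy
```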

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
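The SVM step trains on the oracle-labelled vectors and splits the rest of the cluster by predicted class, producing the two sub-clusters queued in the next loop. A sketch of that split, assuming scikit-learn's `SVC` (the original program may use a different SVM wrapper; `svm_split` is a hypothetical name):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train a linear SVM on oracle-labelled weight vectors and split
    the remaining vectors into predicted match / non-match clusters."""
    clf = SVC(kernel="linear")
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(rest_vecs))
    match_cluster = [v for v, p in zip(rest_vecs, pred) if p == 1]
    non_match_cluster = [v for v, p in zip(rest_vecs, pred) if p == 0]
    return match_cluster, non_match_cluster
```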

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68
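The farthest-first selection performed next greedily picks vectors that maximise their minimum distance to the vectors already selected, so the sample covers the cluster's extremes. A minimal sketch, assuming Euclidean distance and an arbitrary seed vector (both are assumptions; the actual program may choose differently):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising its minimum
    Euclidean distance to the vectors selected so far."""
    selected = [vectors[0]]            # arbitrary starting vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # pick the candidate whose closest selected vector is farthest away
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```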

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)497_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 497), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)497_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 226 true matches and 857 true non-matches
    (20.87% true matches)
  Identified 1026 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   989  (96.39%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1026 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector
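The non-pure clean-up above can be sketched as follows, assuming each labelled vector is a `(weight_tuple, is_match)` pair (a hypothetical layout): unique vectors whose copies carry both match and non-match labels are non-pure, and their minority-class copies are removed so every remaining unique vector is pure:

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """Drop minority-class copies of non-pure unique weight vectors."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    cleaned = []
    for vec, is_match in labelled_vectors:
        labels = groups[tuple(vec)]
        pureness = sum(labels) / len(labels)   # fraction of match copies
        majority = pureness >= 0.5             # tie kept as match (assumption)
        if is_match == majority:
            cleaned.append((vec, is_match))
    return cleaned
```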

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1026

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1026, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1026 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1026 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 938 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 159 matches and 779 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (779, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 779 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 779 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 3 matches and 72 non-matches
    Purity of oracle classification:  0.960
    Entropy of oracle classification: 0.242
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)639_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 639), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)639_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 829
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 829 weight vectors
  Containing 214 true matches and 615 true non-matches
    (25.81% true matches)
  Identified 775 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   740  (95.48%)
          2 :    32  (4.13%)
          3 :     2  (0.26%)
         19 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 775 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 594

Removed 1 non-pure weight vector

Final number of weight vectors to use: 828
  Number of unique weight vectors: 775

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (775, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 775 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 775 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 690 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 150 matches and 540 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (540, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 150 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
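The "farthest first" selection reported throughout this log can be sketched as a greedy traversal: repeatedly pick the weight vector whose minimum distance to the already-selected set is largest. This is a minimal version assuming Euclidean distance and seeding from the first vector; the program's actual seeding and metric are not shown in this output:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from an arbitrary vector,
    then repeatedly add the vector whose minimum distance to the
    already-selected set is largest (a sketch, not the program's code)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    while len(selected) < min(k, len(vectors)):
        # Pick the candidate maximising its distance to the closest
        # already-selected vector (the "farthest" point).
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected
```

The greedy rule spreads the sample across the cluster, which is why the selected vectors above mix clear matches and clear non-matches rather than clustering in one region.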

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 47 matches and 8 non-matches
    Purity of oracle classification:  0.855
    Entropy of oracle classification: 0.598
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0
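The purity and entropy figures reported for each oracle classification follow the standard binary definitions (majority-class fraction, and Shannon entropy in bits), which reproduce the numbers above: 47 matches and 8 non-matches give purity 0.855 and entropy 0.598. A small sketch, assuming these standard definitions:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Binary purity (majority fraction) and Shannon entropy in bits
    of an oracle-labelled sample."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy
```

The same formulas also reproduce the earlier sample of 29 matches and 56 non-matches (purity 0.659, entropy 0.926).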

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(20)548_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 548), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)548_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec
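The unique-vector analysis above (occurrence counts and pureness per distinct weight vector) can be sketched with a Counter. Pureness here is the fraction of a distinct vector's occurrences that are true matches; for example, a vector occurring 20 times with 19 true matches yields the 0.950 entry, and its single minority-class copy is the one removed. A sketch, assuming this definition of pureness:

```python
from collections import Counter

def pureness_of_unique_vectors(weight_vectors, labels):
    """For each distinct weight vector, return its pureness: the
    fraction of its occurrences that are true matches."""
    match_count = Counter()
    total_count = Counter()
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)  # lists are unhashable; use tuples as keys
        total_count[key] += 1
        if is_match:
            match_count[key] += 1
    return {k: match_count[k] / total_count[k] for k in total_count}
```

Vectors whose pureness is strictly between 0 and 1 are the "non-pure" ones; the minority-class copies of such vectors are dropped before training selection starts.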

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 146 matches and 538 non-matches
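The SVM split step above (train on the 29 + 56 oracle-labelled vectors, then classify the remaining 684 into candidate match and non-match sub-clusters) could look like the following hypothetical sketch using scikit-learn's `SVC`; the actual program's kernel and parameters are not visible in this log:

```python
from sklearn import svm

def svm_split(labelled_vecs, labels, remaining_vecs):
    """Train an SVM on the oracle-labelled sample and split the
    remaining cluster by predicted class (1 = match, 0 = non-match).
    Hypothetical sketch with default SVC settings."""
    clf = svm.SVC()
    clf.fit(labelled_vecs, labels)
    pred = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, pred) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are what appear in the next loop's queue (here of sizes 146 and 538), each inheriting the parent's purity and estimated match proportion until it is sampled itself.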

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (538, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 146 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 50 matches and 4 non-matches
    Purity of oracle classification:  0.926
    Entropy of oracle classification: 0.381
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)606_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 606), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)606_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 831
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 831 weight vectors
  Containing 227 true matches and 604 true non-matches
    (27.32% true matches)
  Identified 774 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   737  (95.22%)
          2 :    34  (4.39%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 774 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 583

Removed 1 non-pure weight vector

Final number of weight vectors to use: 830
  Number of unique weight vectors: 774

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (774, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 774 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 774 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 689 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 151 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (538, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 538 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 538 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 9 matches and 64 non-matches
    Purity of oracle classification:  0.877
    Entropy of oracle classification: 0.539
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)3_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 3), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)3_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 744
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 744 weight vectors
  Containing 197 true matches and 547 true non-matches
    (26.48% true matches)
  Identified 702 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   667  (95.01%)
          2 :    32  (4.56%)
          3 :     2  (0.28%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 702 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.000 : 527

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 744
  Number of unique weight vectors: 702

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (702, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 702 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 702 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
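
The farthest-first traversal reported above greedily picks, at each step, the weight vector whose minimum distance to the vectors already selected is largest. A minimal sketch of this selection, assuming Euclidean distance and seeding from the first vector (the original program may use a different metric or seed):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising the minimum
    Euclidean distance to the vectors selected so far."""
    selected = [vectors[0]]  # assumed seed: the first vector
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected
```

Each pick touches every vector once, so selecting k of n vectors costs O(k*n) distance computations.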

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
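
The oracle step assigns each sampled weight vector a label, keeping the true match status with probability equal to the oracle accuracy and flipping it otherwise. A hedged sketch of such a noisy oracle (the flip mechanism is an assumption inferred from the correct/wrong counts reported above):

```python
import random

def oracle_classify(samples, accuracy, rng=None):
    """Split (vector, true_is_match) pairs into matches / non-matches;
    each true label is flipped with probability 1 - accuracy."""
    rng = rng or random.Random()
    matches, non_matches = [], []
    for vec, true_is_match in samples:
        # keep the true label with probability `accuracy`, else flip it
        label = true_is_match if rng.random() < accuracy else not true_is_match
        (matches if label else non_matches).append(vec)
    return matches, non_matches
```

With accuracy 1.0, as in the runs above, no labels are flipped and the oracle reproduces the true match status exactly.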

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 618 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 143 matches and 475 non-matches
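
The split step trains a classifier on the oracle-labelled sample and partitions the remaining, unlabelled vectors of the cluster by predicted class. A minimal sketch using scikit-learn's `SVC` (the kernel and parameters are assumptions; the original program's SVM configuration is not shown here):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs, kernel="rbf"):
    """Fit an SVM on the oracle-labelled vectors (1 = match,
    0 = non-match) and split the rest by predicted class."""
    clf = SVC(kernel=kernel)  # assumed default parameters
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(rest_vecs)
    match_cluster = [v for v, p in zip(rest_vecs, preds) if p == 1]
    non_match_cluster = [v for v, p in zip(rest_vecs, preds) if p == 0]
    return match_cluster, non_match_cluster
```

The two resulting clusters are then pushed back onto the queue, which is why Loop 2 below starts with a queue of length 2.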

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (475, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
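
The queue tuples above, (size, purity, entropy, estimated match proportion), carry the statistics of the parent cluster's oracle sample: 27 matches out of 84 classified vectors. A short sketch of the binary purity/entropy formulas, assuming Shannon entropy in bits (which matches the printed values):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is the binary
    Shannon entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total  # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0 if p in (0.0, 1.0) else \
        -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
    return purity, entropy, p

# 27 matches / 57 non-matches, as in the Loop 1 oracle sample
purity, entropy, p = cluster_stats(27, 57)
```

A perfectly mixed cluster (p = 0.5) has purity 0.5 and entropy 1.0, which is exactly the initial queue entry in Loop 1.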

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 475 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 475 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.714, 0.727, 0.750, 0.294, 0.833] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.348, 0.429, 0.526, 0.529, 0.619] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 4 matches and 67 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.313
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(20)634_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 634), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)634_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 789
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 789 weight vectors
  Containing 225 true matches and 564 true non-matches
    (28.52% true matches)
  Identified 750 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   731  (97.47%)
          2 :    16  (2.13%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 750 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 561

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 788
  Number of unique weight vectors: 750

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (750, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 750 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 750 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.429, 0.786, 0.750, 0.389, 0.857] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 32 matches and 53 non-matches
    Purity of oracle classification:  0.624
    Entropy of oracle classification: 0.956
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 665 weight vectors
  Based on 32 matches and 53 non-matches
  Classified 161 matches and 504 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (161, 0.6235294117647059, 0.9555111232924128, 0.3764705882352941)
    (504, 0.6235294117647059, 0.9555111232924128, 0.3764705882352941)

Current size of match and non-match training data sets: 32 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 504 weight vectors
- Estimated match proportion 0.376

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 504 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.556, 0.429, 0.500, 0.700, 0.643] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.667, 0.400, 0.583, 0.563] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 3 matches and 73 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.240
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)606_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 606), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)606_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 770
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 770 weight vectors
  Containing 212 true matches and 558 true non-matches
    (27.53% true matches)
  Identified 718 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   683  (95.13%)
          2 :    32  (4.46%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 718 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 180
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 537

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 769
  Number of unique weight vectors: 718

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (718, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 718 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 718 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0
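
The purity and entropy values printed above follow the standard binary definitions. A minimal sketch, assuming purity is the majority-class fraction of the oracle-labelled sample and entropy is the binary Shannon entropy of the match proportion:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity and entropy of an oracle-classified sample, plus the
    match proportion used as the cluster's estimate."""
    total = num_matches + num_non_matches
    p = num_matches / total                # match proportion
    purity = max(p, 1.0 - p)               # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):                 # binary (Shannon) entropy
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy, p

purity, entropy, prop = cluster_stats(31, 53)
print(round(purity, 3), round(entropy, 3), round(prop, 3))
# → 0.631 0.95 0.369
```

These reproduce the figures reported for the 31-match / 53-non-match oracle sample, and the same (purity, entropy, proportion) triple is what is carried into the cluster queue below.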

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 634 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 293 matches and 341 non-matches
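
The SVM splitting step can be sketched as: train on the oracle-labelled sample, then partition the remaining cluster by predicted class. scikit-learn's `SVC` and the linear kernel are assumptions here; the script may use a different SVM implementation and parameters.

```python
# Sketch only: sklearn's SVC stands in for whatever SVM implementation
# the script actually uses; kernel choice is an assumption.
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    clf = SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)      # 1 = match, 0 = non-match
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches            # two new clusters for the queue
```

The two predicted sub-clusters (here 293 matches and 341 non-matches) are what appear as the queue entries in the next loop iteration.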

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (293, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (341, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 293 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 68

Farthest-first selection of 68 weight vectors from 293 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.600, 1.000, 0.217, 0.132, 0.167, 0.125, 0.188] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
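
Farthest-first selection greedily adds, at each step, the remaining vector whose minimum distance to the already-selected vectors is largest, which spreads the sample over the weight space. A minimal sketch, with Euclidean distance, a random starting vector, and `max`-based tie-breaking as assumptions:

```python
import random

def farthest_first(vectors, k, seed=42):
    # Sketch: start from a random vector, then repeatedly add the remaining
    # vector whose minimum Euclidean distance to the selection is largest.
    rnd = random.Random(seed)
    remaining = [tuple(v) for v in vectors]
    selected = [remaining.pop(rnd.randrange(len(remaining)))]
    while remaining and len(selected) < k:
        far = max(remaining,
                  key=lambda v: min(sum((a - b) ** 2
                                        for a, b in zip(v, s)) ** 0.5
                                    for s in selected))
        remaining.remove(far)
        selected.append(far)
    return selected

# Selecting all vectors simply recovers the set, in a diversity-first order.
corners = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
print(sorted(farthest_first(corners, 4)))
# → [(0.0, 0.0), (0.0, 1.0), (0.5, 0.5), (1.0, 1.0)]
```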

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 43 matches and 25 non-matches
    Purity of oracle classification:  0.632
    Entropy of oracle classification: 0.949
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  25
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)84_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 84), dtype: object
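
The precision, recall, and f-measure printed in these per-file summaries are mutually consistent under the standard definitions: recall = tp / (tp + fn) = 42 / 299, and the f-measure is the harmonic mean of precision and recall:

```python
def f_measure(precision, recall):
    # Harmonic mean of precision and recall (F1).
    return 2 * precision * recall / (precision + recall)

recall = 42 / (42 + 257)   # tp / (tp + fn) from the summary above
print(round(recall, 6), round(f_measure(1.0, recall), 6))
# → 0.140468 0.246334
```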

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)84_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1018
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1018 weight vectors
  Containing 220 true matches and 798 true non-matches
    (21.61% true matches)
  Identified 964 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   927  (96.16%)
          2 :    34  (3.53%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 964 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 777

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1017
  Number of unique weight vectors: 964
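
The uniqueness analysis above can be sketched with `collections.Counter`: count each distinct weight vector, then tabulate how many vectors occur each number of times.

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # Count each distinct weight vector, then build the frequency
    # distribution "occurrence count -> number of vectors with that count".
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    freq_dist = Counter(vec_counts.values())
    return len(vec_counts), freq_dist

n_unique, dist = occurrence_distribution(
    [(0.1, 0.2), (0.1, 0.2), (0.3, 0.4)])
print(n_unique, sorted(dist.items()))
# → 2 [(1, 1), (2, 1)]
```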

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (964, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 964 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest-first selection of 87 weight vectors from 964 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.800, 0.000, 0.444, 0.545, 0.333, 0.111, 0.533] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 33 matches and 54 non-matches
    Purity of oracle classification:  0.621
    Entropy of oracle classification: 0.958
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 877 weight vectors
  Based on 33 matches and 54 non-matches
  Classified 298 matches and 579 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (298, 0.6206896551724138, 0.9575534837147482, 0.3793103448275862)
    (579, 0.6206896551724138, 0.9575534837147482, 0.3793103448275862)

Current size of match and non-match training data sets: 33 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 298 weight vectors
- Estimated match proportion 0.379

Sample size for this cluster: 69

Farthest-first selection of 69 weight vectors from 298 vectors
  The selected farthest weight vectors are:
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.261, 0.174, 0.148, 0.186, 0.148] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 43 matches and 26 non-matches
    Purity of oracle classification:  0.623
    Entropy of oracle classification: 0.956
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)400_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.99
recall                 0.331104
f-measure              0.496241
da                          100
dm                            0
ndm                           0
tp                           99
fp                            1
tn                  4.76529e+07
fn                          200
Name: (15, 1 - acm diverg, 400), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)400_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 997
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 997 weight vectors
  Containing 167 true matches and 830 true non-matches
    (16.75% true matches)
  Identified 958 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   929  (96.97%)
          2 :    26  (2.71%)
          3 :     2  (0.21%)
         10 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 958 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 148
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 809

Removed 1 non-pure weight vector

Final number of weight vectors to use: 996
  Number of unique weight vectors: 958

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (958, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 958 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest-first selection of 87 weight vectors from 958 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 871 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 280 matches and 591 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (280, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (591, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 591 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 76

Farthest-first selection of 76 weight vectors from 591 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
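
The "Farthest first selection" step above greedily picks, at each round, the weight vector whose distance to the nearest already-selected vector is largest (a k-center heuristic). A minimal sketch, assuming Euclidean distance and seeding from the first vector (the script's actual seeding rule and metric may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly add the vector farthest from the set selected so far."""
    selected = [vectors[0]]
    # distance of every vector to its closest selected vector so far
    dists = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        far_i = max(range(len(vectors)), key=dists.__getitem__)
        selected.append(vectors[far_i])
        dists = [min(d, math.dist(v, vectors[far_i]))
                 for d, v in zip(dists, vectors)]
    return selected
```

Because each new pick maximises the minimum distance to the current sample, the selected vectors spread out over the cluster rather than concentrating in its dense regions.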

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 0 matches and 76 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 76 weight vectors (classified by oracle) from cluster
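
The oracle above simulates a manual classifier: with accuracy a, each queried weight vector keeps its true match status with probability a and is flipped otherwise. A sketch under that assumption (the function name and interface are illustrative, not the script's):

```python
import random

def simulate_oracle(true_statuses, accuracy, rng=None):
    """Simulated manual classification: each true match status is kept
    with probability `accuracy` and flipped otherwise."""
    rng = rng or random.Random()
    return [s if rng.random() < accuracy else not s for s in true_statuses]
```

With accuracy 1.0, as in this run, every decision is correct, so the numbers of false matches and false non-matches are always zero.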

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(10)357_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985915
recall                 0.234114
f-measure              0.378378
da                           71
dm                            0
ndm                           0
tp                           70
fp                            1
tn                  4.76529e+07
fn                          229
Name: (10, 1 - acm diverg, 357), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)357_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 479
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 479 weight vectors
  Containing 175 true matches and 304 true non-matches
    (36.53% true matches)
  Identified 461 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   449  (97.40%)
          2 :     9  (1.95%)
          3 :     2  (0.43%)
          6 :     1  (0.22%)

Identified 0 non-pure unique weight vectors (from 461 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 157
     0.000 : 304

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 479
  Number of unique weight vectors: 461

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (461, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 461 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 461 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.364, 0.619, 0.471, 0.600, 0.533] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 28 matches and 51 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.938
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 382 weight vectors
  Based on 28 matches and 51 non-matches
  Classified 131 matches and 251 non-matches
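
The split step trains a classifier on the oracle-labelled vectors and partitions the remaining vectors of the cluster into a candidate-match and a candidate-non-match sub-cluster. A sketch using scikit-learn's `SVC` as a stand-in (the original script's SVM implementation and parameters are not shown in this output):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train an SVM on the oracle-classified weight vectors, then split
    the remaining vectors into predicted matches and non-matches."""
    clf = SVC(kernel="linear").fit(train_vecs, train_labels)
    preds = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, preds) if p]
    non_matches = [v for v, p in zip(rest_vecs, preds) if not p]
    return matches, non_matches
```

The two resulting sub-clusters are what gets pushed onto the queue for the next loop iteration.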

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.6455696202531646, 0.9379626436434423, 0.35443037974683544)
    (251, 0.6455696202531646, 0.9379626436434423, 0.35443037974683544)
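
Each queue entry above is a tuple (size, purity, entropy, estimated match proportion); the last three follow from the oracle counts of the sampled parent cluster, here 28 matches and 51 non-matches. They can be recomputed as:

```python
import math

def cluster_stats(num_match, num_nonmatch):
    """Purity, binary entropy, and estimated match proportion of a
    cluster, from the oracle's match / non-match counts on its sample."""
    p = num_match / (num_match + num_nonmatch)  # estimated match proportion
    purity = max(p, 1.0 - p)                    # fraction of majority class
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p
```

For 28 matches and 51 non-matches this gives purity 51/79 ≈ 0.6456 and entropy ≈ 0.9380, matching both queue entries.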

Current size of match and non-match training data sets: 28 / 51

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 251 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 65

Farthest first selection of 65 weight vectors from 251 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.579, 0.583, 0.522, 0.417, 0.563] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.923, 0.667, 0.667, 0.412, 0.571] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.500, 0.452, 0.632, 0.714, 0.667] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 0.000, 0.750, 0.714, 0.500, 0.412, 0.762] (False)
    [1.000, 0.000, 0.565, 0.857, 0.833, 0.412, 0.667] (False)
    [1.000, 0.000, 0.846, 0.684, 0.529, 0.727, 0.700] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.591, 0.762, 0.647, 0.636, 0.550] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.333, 0.667, 0.400, 0.583, 0.563] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)

Perform oracle with 100.00% accuracy on 65 weight vectors
  The oracle will correctly classify 65 weight vectors and wrongly classify 0
  Classified 2 matches and 63 non-matches
    Purity of oracle classification:  0.969
    Entropy of oracle classification: 0.198
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 65 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

71.0
Analysing file: diverg(15)344_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 344), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)344_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 744
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 744 weight vectors
  Containing 220 true matches and 524 true non-matches
    (29.57% true matches)
  Identified 708 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   689  (97.32%)
          2 :    16  (2.26%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 708 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 521

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 743
  Number of unique weight vectors: 708
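
The frequency and pureness analysis above groups identical weight vectors and computes, for each unique vector, the fraction of its occurrences that stem from true matches; a unique vector that is neither fully pure (1.000) nor fully impure (0.000) has its minority-class occurrences removed, as happened to the single 0.941-pure vector. A minimal sketch of the counting step (function name assumed):

```python
from collections import Counter

def analyse_weight_vectors(vectors, match_statuses):
    """Frequency distribution of identical weight vectors, and the
    pureness (match fraction) of each unique vector."""
    keys = [tuple(v) for v in vectors]
    occ = Counter(keys)                   # unique vector -> occurrence count
    match_occ = Counter(k for k, m in zip(keys, match_statuses) if m)
    freq_dist = Counter(occ.values())     # occurrence -> number of uniques
    pureness = {k: match_occ[k] / c for k, c in occ.items()}
    return freq_dist, pureness
```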

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (708, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 708 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 708 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 624 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 151 matches and 473 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (473, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 151 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)887_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 887), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)887_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1082
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1082 weight vectors
  Containing 209 true matches and 873 true non-matches
    (19.32% true matches)
  Identified 1035 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1000  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1035 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1081
  Number of unique weight vectors: 1035

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1035, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1035 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1035 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
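
The "far" selection used above can be sketched as a greedy farthest-first traversal over the weight vectors; the seed choice and Euclidean metric here are assumptions, not necessarily what the script implements.

```python
def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector, then
    # repeatedly add the vector whose minimum Euclidean distance to the
    # already-selected set is largest.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[idx]))
    return selected
```

This tends to pick vectors spread across the whole similarity space, which is why the 88 vectors listed above mix clear matches, clear non-matches and borderline cases.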

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
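
The purity and entropy reported for the oracle sample follow the standard definitions (purity is the majority-class fraction, entropy the binary Shannon entropy of the match proportion); a small self-contained sketch, assuming exactly those definitions:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary entropy of a classified sample."""
    total = num_matches + num_non_matches
    p = num_matches / total    # match proportion
    purity = max(p, 1.0 - p)   # fraction of vectors in the majority class
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 23 matches and 65 non-matches above this gives purity 65/88 ≈ 0.739 and entropy ≈ 0.829, matching the values reported.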

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 947 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 846 non-matches
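
The SVM split of the remaining 947 vectors can be approximated with scikit-learn; using `SVC` with its defaults (RBF kernel) is an assumption here, and the script's actual SVM implementation and parameters may differ.

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, cluster_vectors):
    # Train a binary SVM on the oracle-labelled sample, then split the
    # remaining unlabelled cluster into predicted matches and non-matches,
    # which become the two new clusters placed on the queue.
    clf = SVC()  # assumed default RBF kernel; script parameters may differ
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, predictions) if p == 0]
    return matches, non_matches
```

Note that both resulting clusters inherit the purity/entropy estimates of their parent, which is why the two queue entries in Loop 2 below share identical statistics.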

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 846 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 846 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)298_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (10, 1 - acm diverg, 298), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)298_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 480
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 480 weight vectors
  Containing 154 true matches and 326 true non-matches
    (32.08% true matches)
  Identified 467 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   458  (98.07%)
          2 :     6  (1.28%)
          3 :     2  (0.43%)
          4 :     1  (0.21%)

Identified 0 non-pure unique weight vectors (from 467 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 141
     0.000 : 326

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 480
  Number of unique weight vectors: 467

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (467, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 467 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 467 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 26 matches and 53 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 388 weight vectors
  Based on 26 matches and 53 non-matches
  Classified 109 matches and 279 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.6708860759493671, 0.9140185106642176, 0.3291139240506329)
    (279, 0.6708860759493671, 0.9140185106642176, 0.3291139240506329)

Current size of match and non-match training data sets: 26 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 109 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 45 matches and 3 non-matches
    Purity of oracle classification:  0.938
    Entropy of oracle classification: 0.337
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(15)794_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 794), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)794_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 742
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 742 weight vectors
  Containing 220 true matches and 522 true non-matches
    (29.65% true matches)
  Identified 706 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   687  (97.31%)
          2 :    16  (2.27%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 706 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 519

Removed 1 non-pure weight vector

Final number of weight vectors to use: 741
  Number of unique weight vectors: 706

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (706, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 706 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 622 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 151 matches and 471 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (471, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 471 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 471 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.870, 0.619, 0.643, 0.700, 0.524] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
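
Farthest-first selection, as used above, greedily grows the sample by always adding the vector whose distance to its nearest already-selected vector is largest. A minimal sketch (Euclidean distance and starting from the first vector are assumptions; the log does not show the program's metric or seeding):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector,
    then repeatedly add the vector farthest from the selected set."""
    def dist2(a, b):
        # squared Euclidean distance (ordering is the same as Euclidean)
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # candidate whose nearest selected vector is farthest away
        best = max(remaining,
                   key=lambda v: min(dist2(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Picks the four corners of the unit square, skipping the centre point
corners = farthest_first([(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)], 4)
```

This tendency to pick extreme points explains why the selected samples above are dominated by vectors with many 0.000 and 1.000 entries.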

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 6 matches and 67 non-matches
    Purity of oracle classification:  0.918
    Entropy of oracle classification: 0.410
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(15)333_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 333), dtype: object
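
The f-measure in the result row above is the harmonic mean of precision and recall (with recall = tp / (tp + fn) = 57 / 299); a quick check:

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Values from the result row above (tp = 57, fn = 242)
print(round(f_measure(1.0, 57 / 299), 6))  # → 0.320225
```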

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)333_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 804
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 804 weight vectors
  Containing 208 true matches and 596 true non-matches
    (25.87% true matches)
  Identified 757 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   722  (95.38%)
          2 :    32  (4.23%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 757 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 575

Removed 1 non-pure weight vector

Final number of weight vectors to use: 803
  Number of unique weight vectors: 757
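
The uniqueness and pureness analysis above groups identical weight vectors and measures the match fraction within each group; a sketch of that grouping on hypothetical data:

```python
from collections import Counter, defaultdict

# Hypothetical (weight_vector, true_match) pairs; the real program
# reads these from the weight vector file
pairs = [((1.0, 0.5), True), ((1.0, 0.5), True),
         ((0.2, 0.1), False), ((0.9, 0.8), True),
         ((0.9, 0.8), False)]  # (0.9, 0.8) is a non-pure vector

groups = defaultdict(list)
for vec, is_match in pairs:
    groups[vec].append(is_match)

# Frequency distribution: occurrence count -> number of unique vectors
freq = Counter(len(labels) for labels in groups.values())

# Pureness: fraction of matches within each unique vector's group
pureness = {vec: sum(labels) / len(labels) for vec, labels in groups.items()}
non_pure = [vec for vec, p in pureness.items() if 0.0 < p < 1.0]
```

Minority-class copies of non-pure vectors are then dropped, as in the "Removed 1 non-pure weight vectors" step above.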

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (757, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 757 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 757 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 26 matches and 59 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.888
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 672 weight vectors
  Based on 26 matches and 59 non-matches
  Classified 139 matches and 533 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (139, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)
    (533, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)

Current size of match and non-match training data sets: 26 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 139 weight vectors
- Estimated match proportion 0.306

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 139 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(20)304_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 304), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)304_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)58_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 58), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)58_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 961
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 961 weight vectors
  Containing 217 true matches and 744 true non-matches
    (22.58% true matches)
  Identified 906 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   870  (96.03%)
          2 :    33  (3.64%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 906 unique weight vectors)
Pureness (as the percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 723

Removed 1 non-pure weight vector

Final number of weight vectors to use: 960
  Number of unique weight vectors: 906

Time to load and analyse the weight vector file: 0.01 sec
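
The pureness filtering reported above groups identical weight vectors, computes the fraction of true matches per unique vector, and drops minority-class copies of any non-pure vector. A minimal sketch of that step; the function and variable names are illustrative, not taken from the original program:

```python
from collections import defaultdict

def remove_minority_copies(vectors, labels):
    """Drop minority-class copies of weight vectors that occur with both
    match (1) and non-match (0) labels, i.e. non-pure vectors."""
    # Group identical weight vectors and collect their true labels.
    groups = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        groups[tuple(vec)].append(lab)

    kept = []
    for vec, labs in groups.items():
        pureness = sum(labs) / len(labs)        # fraction of true matches
        majority = 1 if pureness >= 0.5 else 0  # majority class of this vector
        for lab in labs:
            # Only minority-class copies of a non-pure vector are removed.
            if 0.0 < pureness < 1.0 and lab != majority:
                continue
            kept.append((list(vec), lab))
    return kept
```

Applied to the run above, the single vector with pureness 0.947 (18 matches, 1 non-match among 19 copies) would lose exactly its one non-match copy, which matches the "Removed 1 non-pure weight vectors" line.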

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (906, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 906 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 906 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
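
The farthest-first traversal used for these selections starts from one vector and repeatedly adds the vector whose distance to its closest already-selected vector is largest. A minimal sketch, assuming Euclidean distance and a random starting vector (the original program may make different choices for both):

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first selection of k vectors. Illustrative sketch."""
    rng = random.Random(seed)
    selected = [rng.choice(vectors)]
    # Distance of every vector to its nearest already-selected vector.
    dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        # Pick the vector farthest from all selected vectors so far.
        i = max(range(len(vectors)), key=dist.__getitem__)
        selected.append(vectors[i])
        dist = [min(d, math.dist(v, vectors[i])) for d, v in zip(dist, vectors)]
    return selected
```

This greedy traversal tends to pick extreme, mutually distant vectors first, which is why the selected samples above mix clear matches and clear non-matches.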

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
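
The purity and entropy figures reported for each oracle-labelled sample are the majority-class fraction and the binary entropy of the match/non-match split, which can be computed as:

```python
import math

def purity(num_match, num_non_match):
    """Fraction of the majority class in an oracle-labelled sample."""
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    """Binary entropy (in bits) of the match/non-match split."""
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        if count:  # 0 * log2(0) is taken as 0
            p = count / total
            h -= p * math.log2(p)
    return h
```

For the 26 matches and 61 non-matches above this gives a purity of 61/87 ≈ 0.701 and an entropy of ≈ 0.880, matching both the oracle report and the queue entries in the next loop.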

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 819 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 135 matches and 684 non-matches
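
The split step trains a classifier on the oracle-labelled sample and partitions the remaining weight vectors into predicted-match and predicted-non-match sub-clusters, which are then pushed back onto the queue. A minimal sketch, assuming scikit-learn is available and using a linear-kernel SVM (the original program's SVM parameters are not shown in this log):

```python
from sklearn.svm import SVC

def split_cluster(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled sample (labels 1 = match,
    0 = non-match), then split the remaining cluster by prediction."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches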

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (684, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 135 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 135 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 50 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.139
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
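
The overall control flow traced by this log is a queue of clusters processed until the manual classification budget is exhausted: pop a cluster (queue ordering: random), have the oracle label a sample from it, and push any split sub-clusters back onto the queue. A skeleton of that loop; `run_selection`, `sample_and_label`, and `split` are illustrative stand-ins for the program's own oracle-sampling and SVM-splitting steps:

```python
import random

def run_selection(initial_cluster, budget, sample_and_label, split):
    """Budgeted recursive selection loop. Illustrative sketch only."""
    queue = [initial_cluster]
    used = 0          # number of manual oracle classifications performed
    train = []        # accumulated labelled training examples
    while queue and used < budget:
        # Queue ordering: random, as reported in the log.
        cluster = queue.pop(random.randrange(len(queue)))
        labelled, rest = sample_and_label(cluster)   # oracle step
        used += len(labelled)
        train.extend(labelled)
        if rest:
            queue.extend(split(train, rest))         # e.g. SVM split
    return train
```

When the budget runs out mid-cluster, the loop stops and reports "Reached end of manual classification budget", as seen above.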

40.0
Analysing the file: diverg(15)245_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 245), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)245_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as the percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 146 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (538, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 146 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 50 matches and 4 non-matches
    Purity of oracle classification:  0.926
    Entropy of oracle classification: 0.381
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)963_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 963), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)963_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 690
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 690 weight vectors
  Containing 217 true matches and 473 true non-matches
    (31.45% true matches)
  Identified 635 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   599  (94.33%)
          2 :    33  (5.20%)
          3 :     2  (0.31%)
         19 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 635 unique weight vectors)
Pureness (as the percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 452

Removed 1 non-pure weight vector

Final number of weight vectors to use: 689
  Number of unique weight vectors: 635

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (635, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 635 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 635 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 24 matches and 59 non-matches
    Purity of oracle classification:  0.711
    Entropy of oracle classification: 0.868
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
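The purity and entropy values reported after each oracle step can be recomputed from the match / non-match counts alone: purity is the majority-class fraction and entropy is the base-2 class entropy of the split. A minimal sketch (the function name `purity_entropy` is illustrative, not from the program):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity: majority class fraction. Entropy: base-2 class
    entropy of the match / non-match label distribution."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The 24 matches and 59 non-matches classified above:
purity, entropy = purity_entropy(24, 59)
print('%.3f %.3f' % (purity, entropy))  # 0.711 0.868
```

Note that both child clusters created by the subsequent split inherit these values, which is why the two queue entries in the next loop show identical purity, entropy, and match-proportion figures.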

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 552 weight vectors
  Based on 24 matches and 59 non-matches
  Classified 46 matches and 506 non-matches
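The split step trains a binary classifier on the oracle-labelled sample and uses it to partition the remaining unlabelled vectors into a predicted-match and a predicted-non-match cluster, both of which go back on the queue. A sketch using scikit-learn's `svm.SVC` on random stand-in data (the original program's SVM wrapper and features may differ):

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(42)

# Hypothetical stand-ins for the oracle-labelled training sample
# (24 matches, 59 non-matches) and the 552 remaining vectors.
train_X = rng.random((83, 7))
train_y = np.array([1] * 24 + [0] * 59)
rest_X = rng.random((552, 7))

clf = svm.SVC(kernel='linear')
clf.fit(train_X, train_y)

pred = clf.predict(rest_X)
match_cluster = rest_X[pred == 1]      # predicted matches
non_match_cluster = rest_X[pred == 0]  # predicted non-matches
print(len(match_cluster), len(non_match_cluster))
```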

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (46, 0.7108433734939759, 0.8676293117125106, 0.2891566265060241)
    (506, 0.7108433734939759, 0.8676293117125106, 0.2891566265060241)

Current size of match and non-match training data sets: 24 / 59

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 506 weight vectors
- Estimated match proportion 0.289

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 506 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.423, 0.478, 0.500, 0.813, 0.545] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
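The farthest-first selections logged above follow a greedy traversal: pick a seed vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A compact sketch assuming Euclidean distance and the first vector as seed (the program's actual distance and seeding may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector,
    then repeatedly add the vector farthest from the selected set."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    # min_d[i]: distance from remaining[i] to its nearest selected vector
    min_d = [dist(v, selected[0]) for v in remaining]
    while len(selected) < k and remaining:
        i = max(range(len(remaining)), key=min_d.__getitem__)
        chosen = remaining.pop(i)
        min_d.pop(i)
        selected.append(chosen)
        min_d = [min(d, dist(v, chosen)) for d, v in zip(min_d, remaining)]
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0)], 2)
print(sample)  # [(0.0, 0.0), (1.0, 1.0)]
```

The effect is visible in the selections above: the chosen vectors spread across the weight space rather than clustering near one region, which is why both clear matches and clear non-matches appear in every sample.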

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 22 matches and 46 non-matches
    Purity of oracle classification:  0.676
    Entropy of oracle classification: 0.908
    Number of true matches:      22
    Number of false matches:     0
    Number of true non-matches:  46
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(20)31_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 31), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)31_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)
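The frequency distribution above counts how often each distinct weight vector occurs in the loaded file; it can be reproduced with two nested `collections.Counter` passes (the vector tuples below are a toy stand-in for the 862 loaded vectors):

```python
from collections import Counter

# Toy stand-in for the loaded weight vectors
vectors = [(1.0, 0.0), (1.0, 0.0), (0.5, 0.5),
           (1.0, 1.0), (1.0, 1.0), (1.0, 1.0)]

occ = Counter(vectors)        # occurrences of each unique vector
dist = Counter(occ.values())  # occurrence count -> number of unique vectors
total_unique = len(occ)

for count in sorted(dist):
    n = dist[count]
    print('%10d : %5d  (%.2f%%)' % (count, n, 100.0 * n / total_unique))
```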

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector
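A unique weight vector is non-pure when copies of it carry both match and non-match labels; the clean-up keeps the majority-label copies and drops the minority ones (as with the single 0.950-pureness vector above). A sketch with illustrative data:

```python
from collections import defaultdict

# (vector, true_match_label) pairs; the duplicated vector is non-pure
pairs = ([((1.0, 1.0), True)] * 19 + [((1.0, 1.0), False)]
         + [((0.2, 0.1), False)])

by_vec = defaultdict(list)
for vec, label in pairs:
    by_vec[vec].append(label)

cleaned = []
for vec, labels in by_vec.items():
    pureness = sum(labels) / len(labels)  # fraction of match labels
    if 0.0 < pureness < 1.0:
        # non-pure: keep only the majority-class copies
        majority = pureness >= 0.5
        cleaned += [(vec, majority)] * labels.count(majority)
    else:
        cleaned += [(vec, lab) for lab in labels]

print(len(pairs), len(cleaned))  # 21 20
```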

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 566 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)316_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 316), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)316_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 739
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 739 weight vectors
  Containing 212 true matches and 527 true non-matches
    (28.69% true matches)
  Identified 687 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   652  (94.91%)
          2 :    32  (4.66%)
          3 :     2  (0.29%)
         17 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 687 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 506

Removed 1 non-pure weight vector

Final number of weight vectors to use: 738
  Number of unique weight vectors: 687

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (687, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 687 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 687 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 603 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 291 matches and 312 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (291, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (312, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 291 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 291 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 44 matches and 24 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  24
    Number of false non-matches: 0
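The purity and entropy figures reported above follow directly from the oracle's match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy in bits. A minimal sketch of those two formulas, assuming this is how the script computes them:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary Shannon entropy (bits)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The 44 matches / 24 non-matches classified above:
purity, entropy = purity_entropy(44, 24)
print(round(purity, 3), round(entropy, 3))  # 0.647 0.937
```

The same formulas reproduce the later oracle blocks as well (e.g. 30 matches / 57 non-matches gives purity 0.655 and entropy 0.929).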

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)60_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 60), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)60_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 221 true matches and 855 true non-matches
    (20.54% true matches)
  Identified 1020 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   984  (96.47%)
          2 :    33  (3.24%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1020 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 834

Removed 1 non-pure weight vector
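The analysis step above groups identical weight vectors, tallies how often each occurs, and reports each unique vector's pureness, i.e. the fraction of its occurrences generated by true matches; minority-class copies of non-pure vectors are then dropped. A minimal sketch of that grouping, assuming weight vectors are compared element-wise for exact equality:

```python
from collections import defaultdict

def weight_vector_pureness(vectors, labels):
    """Group identical weight vectors and compute each group's pureness
    (fraction of occurrences that are true matches)."""
    groups = defaultdict(lambda: [0, 0])  # vector -> [match count, total count]
    for vec, is_match in zip(vectors, labels):
        key = tuple(vec)
        groups[key][0] += int(is_match)
        groups[key][1] += 1
    return {k: m / t for k, (m, t) in groups.items()}

# toy example: one vector occurs twice with conflicting labels (non-pure)
vecs = [(1.0, 0.9), (1.0, 0.9), (0.1, 0.2)]
labs = [True, False, False]
print(weight_vector_pureness(vecs, labs))  # {(1.0, 0.9): 0.5, (0.1, 0.2): 0.0}
```

In the run above, one unique vector had pureness 0.950, so its single minority-class (non-match) copy was removed, leaving 1075 of the original 1076 vectors.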

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1020

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1020, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1020 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1020 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
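The "far" selection listed above is the classic farthest-first traversal: starting from a seed vector, repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal sketch under the assumptions of a first-vector seed and Euclidean distance (the script's actual seeding and metric may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximizing its minimum
    Euclidean distance to the vectors selected so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # assumed seed: the first vector
    min_dist = [dist(vectors[0], v) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        # each vector's distance to its nearest selected vector can only shrink
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(vectors[idx], v))
    return selected

# toy example: points on a line; the traversal spreads out across them
pts = [(0.0,), (0.1,), (5.0,), (5.1,), (10.0,)]
print(farthest_first(pts, 3))  # [(0.0,), (10.0,), (5.0,)]
```

Because each pick maximizes coverage of the remaining space, the sample mixes high-similarity (likely match) and low-similarity (likely non-match) vectors, which is what the oracle counts below reflect.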

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 933 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 170 matches and 763 non-matches
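The split step above trains a binary classifier on the oracle-labelled sample (30 matches, 57 non-matches) and uses it to partition the remaining 933 unlabelled weight vectors into a candidate-match cluster and a candidate-non-match cluster, which then both enter the queue. A minimal sketch using scikit-learn's SVC (an assumption for illustration; the original script may use a different SVM implementation or kernel):

```python
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining vectors into predicted-match / predicted-non-match clusters."""
    clf = SVC(kernel="linear")
    clf.fit(labelled_vecs, labels)
    preds = clf.predict(unlabelled_vecs)
    match_cluster = [v for v, p in zip(unlabelled_vecs, preds) if p == 1]
    non_match_cluster = [v for v, p in zip(unlabelled_vecs, preds) if p == 0]
    return match_cluster, non_match_cluster

# toy example: high similarity values labelled as matches (1)
train = [[0.9, 0.8], [0.95, 0.9], [0.1, 0.2], [0.2, 0.1]]
y = [1, 1, 0, 0]
matches, non_matches = svm_split(train, y, [[0.85, 0.9], [0.15, 0.1]])
```

Note that in the Loop 2 queue below, both child clusters initially inherit the parent sample's purity, entropy, and estimated match proportion; they are only refined once each cluster is sampled and classified itself.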

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (170, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (763, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 170 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 170 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 43 matches and 15 non-matches
    Purity of oracle classification:  0.741
    Entropy of oracle classification: 0.825
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  15
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(15)822_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 822), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)822_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 786
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 786 weight vectors
  Containing 208 true matches and 578 true non-matches
    (26.46% true matches)
  Identified 757 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   740  (97.75%)
          2 :    14  (1.85%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 757 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 575

Removed 1 non-pure weight vector

Final number of weight vectors to use: 785
  Number of unique weight vectors: 757

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (757, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 757 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 757 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 672 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 131 matches and 541 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (541, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 541 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 541 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 12 matches and 61 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.645
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)299_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (15, 1 - acm diverg, 299), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)299_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 616
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 616 weight vectors
  Containing 201 true matches and 415 true non-matches
    (32.63% true matches)
  Identified 582 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   566  (97.25%)
          2 :    13  (2.23%)
          3 :     2  (0.34%)
         18 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 582 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 169
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 412

Removed 1 non-pure weight vector

Final number of weight vectors to use: 615
  Number of unique weight vectors: 582

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (582, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 582 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 582 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
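Farthest-first traversal greedily picks, at each step, the vector whose distance to its nearest already-selected vector is largest. A minimal sketch, assuming Euclidean distance and the first vector as the starting point (the actual program may seed the traversal differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: each new pick maximises the
    minimum Euclidean distance to the vectors already selected."""
    selected = [vectors[0]]
    while len(selected) < k:
        next_vec = max(vectors,
                       key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(next_vec)
    return selected

# From three points, the second pick is the one farthest from the start.
picks = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.4, 0.4]], 2)
# → [[0.0, 0.0], [1.0, 1.0]]
```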

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 31 matches and 51 non-matches
    Purity of oracle classification:  0.622
    Entropy of oracle classification: 0.957
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0
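The purity and entropy figures reported for each oracle-classified sample follow the standard two-class definitions: purity is the majority-class fraction, entropy is the binary entropy of the match proportion. A sketch that reproduces the numbers above (hypothetical helper, not the program's own function):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary entropy
    of the match proportion (0.0 for a perfectly pure cluster)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# 31 matches / 51 non-matches, as classified by the oracle above:
purity, entropy = purity_entropy(31, 51)
# → purity ≈ 0.622, entropy ≈ 0.957
```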

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 500 weight vectors
  Based on 31 matches and 51 non-matches
  Classified 142 matches and 358 non-matches
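The split step trains a binary classifier on the oracle-labelled sample and partitions the remaining unlabelled weight vectors by its predictions. A rough sketch using scikit-learn's `SVC` as a stand-in (an assumption: the original program may use a different SVM implementation and kernel settings):

```python
from sklearn.svm import SVC  # assumption: scikit-learn as the SVM backend

def split_cluster(train_vectors, train_labels, remaining_vectors):
    """Fit an SVM on the oracle-labelled sample, then split the
    remaining weight vectors into predicted matches / non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(remaining_vectors, predictions) if p == 0]
    return matches, non_matches
```

Note that both resulting sub-clusters initially inherit the sample's purity and entropy as estimates until they are sampled themselves, which is why both queue entries in Loop 2 show identical values.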

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)
    (358, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)

Current size of match and non-match training data sets: 31 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 142 weight vectors
- Estimated match proportion 0.378

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 51 matches and 4 non-matches
    Purity of oracle classification:  0.927
    Entropy of oracle classification: 0.376
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analyzing file: diverg(15)821_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 821), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)821_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 799
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 799 weight vectors
  Containing 213 true matches and 586 true non-matches
    (26.66% true matches)
  Identified 747 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   712  (95.31%)
          2 :    32  (4.28%)
          3 :     2  (0.27%)
         17 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 747 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 565

Removed 1 non-pure weight vector

Final number of weight vectors to use: 798
  Number of unique weight vectors: 747

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (747, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 747 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 747 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 662 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 86 matches and 576 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (86, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (576, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 86 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 86 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.956, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.950, 0.923, 0.941] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 43 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analyzing file: diverg(15)925_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976744
recall                 0.421405
f-measure              0.588785
da                          129
dm                            0
ndm                           0
tp                          126
fp                            3
tn                  4.76529e+07
fn                          173
Name: (15, 1 - acm diverg, 925), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)925_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 946
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 946 weight vectors
  Containing 138 true matches and 808 true non-matches
    (14.59% true matches)
  Identified 912 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   883  (96.82%)
          2 :    26  (2.85%)
          3 :     2  (0.22%)
          5 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 912 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 124
     0.000 : 788

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 946
  Number of unique weight vectors: 912

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (912, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 912 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 912 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 825 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 89 matches and 736 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (89, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (736, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 89 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 89 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
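
Farthest-first selection builds a diverse sample: it starts from one vector and repeatedly adds the vector whose distance to its nearest already-selected vector is largest. A sketch assuming Euclidean distance (the script's actual distance function is not visible in this output):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric tuples."""
    selected = [vectors[0]]                  # arbitrary starting vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Vector whose nearest selected neighbour is farthest away
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```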

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 38 matches and 5 non-matches
    Purity of oracle classification:  0.884
    Entropy of oracle classification: 0.519
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
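
Taken together, the log traces a budget-limited loop over a queue of clusters: pop a cluster, sample it, label the sample via the oracle, and split impure or oversized clusters with a classifier before re-queueing the children. A high-level sketch with hypothetical stand-ins for the sampling, oracle, and split steps:

```python
def recursive_selection(queue, budget, sample, oracle, split,
                        min_purity=0.95, max_cluster_size=100):
    """Sketch of the outer loop; `sample`, `oracle`, and `split` are
    hypothetical callables standing in for the steps shown in the log."""
    used = 0
    while queue and used < budget:
        cluster = queue.pop(0)               # queue ordering: random in the log
        sampled = sample(cluster)            # e.g. farthest-first selection
        labels = oracle(sampled)             # manual match/non-match labels
        used += len(sampled)
        remaining = [v for v in cluster if v not in sampled]
        purity = max(sum(labels), len(labels) - sum(labels)) / len(labels)
        if remaining and (purity < min_purity
                          or len(remaining) > max_cluster_size):
            queue.extend(split(remaining, sampled, labels))  # e.g. SVM split
    return used                              # manual classifications spent
```

Pure clusters would instead feed the match/non-match training sets; that step is omitted here for brevity.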

129.0
Analysing file: diverg(10)195_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 195), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)195_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 848
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 848 weight vectors
  Containing 189 true matches and 659 true non-matches
    (22.29% true matches)
  Identified 808 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   774  (95.79%)
          2 :    31  (3.84%)
          3 :     2  (0.25%)
          6 :     1  (0.12%)
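
The occurrence distribution above is a double count: first how often each unique weight vector occurs, then how many unique vectors share each occurrence count. A sketch using `collections.Counter`:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique vectors
    that occur that often, as in the table above."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```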

Identified 0 non-pure unique weight vectors (from 808 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 169
     0.000 : 639

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 848
  Number of unique weight vectors: 808

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (808, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 808 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 808 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 30 matches and 56 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.933
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 722 weight vectors
  Based on 30 matches and 56 non-matches
  Classified 168 matches and 554 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (168, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)
    (554, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)

Current size of match and non-match training data sets: 30 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 554 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 554 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 0 matches and 75 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(15)802_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 802), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)802_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 506
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 506 weight vectors
  Containing 205 true matches and 301 true non-matches
    (40.51% true matches)
  Identified 477 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   460  (96.44%)
          2 :    14  (2.94%)
          3 :     2  (0.42%)
         12 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 477 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 178
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 298

Removed 1 non-pure weight vector
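
A unique weight vector is non-pure when it was generated by both true matching and true non-matching record pairs; the copies carrying the minority label are dropped, as with the pureness-0.917 vector above (11 of its 12 copies were matches, so the single non-match copy is removed). A sketch of that filtering (function and variable names are hypothetical):

```python
from collections import defaultdict

def remove_minority_labels(labelled_vectors):
    """labelled_vectors: list of (weight_vector_tuple, is_match) pairs,
    possibly with duplicate vectors. Keep only majority-label copies."""
    by_vector = defaultdict(list)
    for vec, is_match in labelled_vectors:
        by_vector[vec].append(is_match)
    kept = []
    for vec, labels in by_vector.items():
        majority = sum(labels) * 2 >= len(labels)   # tie counts as match
        kept.extend((vec, m) for m in labels if m == majority)
    return kept
```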

Final number of weight vectors to use: 505
  Number of unique weight vectors: 477

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (477, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 477 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 477 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 36 matches and 44 non-matches
    Purity of oracle classification:  0.550
    Entropy of oracle classification: 0.993
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 397 weight vectors
  Based on 36 matches and 44 non-matches
  Classified 281 matches and 116 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (281, 0.55, 0.9927744539878084, 0.45)
    (116, 0.55, 0.9927744539878084, 0.45)

Current size of match and non-match training data sets: 36 / 44

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 116 weight vectors
- Estimated match proportion 0.450

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 116 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.619, 1.000, 0.103, 0.163, 0.129, 0.146, 0.213] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 1 match and 51 non-matches
    Purity of oracle classification:  0.981
    Entropy of oracle classification: 0.137
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(10)740_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 740), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)740_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 779
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 779 weight vectors
  Containing 222 true matches and 557 true non-matches
    (28.50% true matches)
  Identified 725 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   688  (94.90%)
          2 :    34  (4.69%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 725 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 536

Removed 1 non-pure weight vector

Final number of weight vectors to use: 778
  Number of unique weight vectors: 725

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (725, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 725 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 725 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
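
Farthest-first selection, as used here, greedily picks the weight vector whose distance to the nearest already-selected vector is largest, spreading the sample across the cluster. A minimal sketch, assuming Euclidean distance, unique input vectors, and an arbitrary fixed starting vector (the actual program may choose differently):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors (tuples of floats) by farthest-first traversal."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]  # arbitrary seed
    while len(selected) < k:
        # Pick the candidate whose minimum distance to the selected set is largest
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```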

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 32 matches and 53 non-matches
    Purity of oracle classification:  0.624
    Entropy of oracle classification: 0.956
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0
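
The purity and entropy figures reported for each oracle classification appear to be the majority-class fraction and the binary Shannon entropy of the match/non-match split; a sketch under that assumption:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class fraction and binary Shannon entropy of a 2-class split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)          # fraction of the larger class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the split above, `purity_entropy(32, 53)` gives purity ≈ 0.624 and entropy ≈ 0.956, matching the logged values.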

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 640 weight vectors
  Based on 32 matches and 53 non-matches
  Classified 300 matches and 340 non-matches
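
The SVM split of the remaining cluster can be approximated with scikit-learn's `SVC` — an assumption, since the original program may use a different SVM implementation or kernel settings. The classifier is trained on the oracle-labelled sample and then partitions the unlabelled vectors into predicted matches and non-matches:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train on oracle-labelled vectors, then split the remaining cluster."""
    clf = SVC()  # default RBF kernel; the original settings are unknown
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, pred) if p]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if not p]
    return matches, non_matches
```

Each resulting sub-cluster is pushed back onto the queue with the purity, entropy, and match-proportion estimates from the oracle sample, as shown in the next loop.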

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (300, 0.6235294117647059, 0.9555111232924128, 0.3764705882352941)
    (340, 0.6235294117647059, 0.9555111232924128, 0.3764705882352941)

Current size of match and non-match training data sets: 32 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 300 weight vectors
- Estimated match proportion 0.376

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 300 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.600, 1.000, 0.217, 0.132, 0.167, 0.125, 0.188] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 44 matches and 25 non-matches
    Purity of oracle classification:  0.638
    Entropy of oracle classification: 0.945
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  25
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)944_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 944), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)944_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 226 true matches and 857 true non-matches
    (20.87% true matches)
  Identified 1026 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   989  (96.39%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1026 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1026

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1026, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1026 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1026 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 938 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 177 matches and 761 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (177, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (761, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 761 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 761 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)84_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 84), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)84_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 801
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 801 weight vectors
  Containing 220 true matches and 581 true non-matches
    (27.47% true matches)
  Identified 763 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   745  (97.64%)
          2 :    15  (1.97%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 763 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 578

Removed 1 non-pure weight vector

Final number of weight vectors to use: 800
  Number of unique weight vectors: 763

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (763, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 763 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 763 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
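
The purity, entropy, and estimated match proportion reported for an oracle-classified sample can be reproduced from the match/non-match counts alone. The sketch below is a reconstruction from the log output (the program's own source is not shown here), assuming purity is the majority-class fraction and entropy the binary Shannon entropy of the match proportion:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Majority-class purity, binary Shannon entropy, and estimated
    match proportion for an oracle-classified sample (reconstruction
    from the log output, not the program's own code)."""
    total = num_matches + num_non_matches
    prop = num_matches / total            # estimated match proportion
    purity = max(prop, 1.0 - prop)        # fraction in the majority class
    entropy = 0.0
    for q in (prop, 1.0 - prop):
        if q > 0.0:
            entropy -= q * math.log2(q)   # binary Shannon entropy
    return purity, entropy, prop

# The oracle run above: 28 matches, 57 non-matches
purity, entropy, prop = cluster_stats(28, 57)
print(round(purity, 3), round(entropy, 3), round(prop, 3))  # → 0.671 0.914 0.329
```

The same three numbers reappear as the (size, purity, entropy, proportion) tuples in the cluster queue printed at the start of each loop.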

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 678 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 135 matches and 543 non-matches
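
After the oracle labels a sample, the remaining vectors of the cluster are classified by an SVM trained on those labels, and the cluster is split into a predicted-match and a predicted-non-match sub-cluster (135 and 543 vectors here). A minimal pure-Python sketch of the split step, using a nearest-centroid rule as a stand-in for the SVM (the actual classifier internals are not visible in the log):

```python
def split_cluster(labelled, remaining):
    """Split the still-unlabelled weight vectors of a cluster into
    predicted matches and non-matches, trained on the oracle-labelled
    sample.  A nearest-centroid rule stands in for the SVM used by
    the actual program (an assumption for illustration)."""
    def centroid(vectors):
        return [sum(col) / len(vectors) for col in zip(*vectors)]

    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    match_c = centroid([v for v, is_match in labelled if is_match])
    non_match_c = centroid([v for v, is_match in labelled if not is_match])

    matches, non_matches = [], []
    for v in remaining:
        (matches if sq_dist(v, match_c) < sq_dist(v, non_match_c)
         else non_matches).append(v)
    return matches, non_matches   # the two new clusters for the queue
```

Both sub-clusters go back onto the cluster queue; as the identical queue tuples in Loop 2 show, they initially inherit the parent sample's purity, entropy, and match-proportion estimates.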

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 135 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 135 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
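
The "farthest first" selections above can be sketched as a greedy farthest-first traversal: start from a seed vector and repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal version, assuming squared Euclidean distance and the first vector as seed (the program's distance measure and seeding rule are not visible in the log):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector
    whose minimum squared Euclidean distance to the selected set is
    largest (seeding with the first vector is an assumption)."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    selected = [vectors[0]]
    # minimum squared distance from every vector to the selected set
    min_d = [sq_dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):   # update after the new pick
            min_d[j] = min(min_d[j], sq_dist(v, vectors[i]))
    return selected

print(farthest_first([[0.0], [1.0], [0.5], [0.9]], 2))  # → [[0.0], [1.0]]
```

This greedy rule explains why the selected samples spread across the extremes of the weight-vector space rather than clustering around one region.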

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing file: diverg(15)473_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 473), dtype: object
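
The precision, recall, and f-measure rows in each per-file summary are consistent with the tp/fp/fn counts of the same Series, assuming the standard definitions:

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from true-positive,
    false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

# tp=40, fp=0, fn=259 as reported for diverg(15)473_NEW.csv
p, r, f = prf(40, 0, 259)
print(round(p, 6), round(r, 6), round(f, 6))  # → 1.0 0.133779 0.235988
```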

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)473_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 945
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 945 weight vectors
  Containing 219 true matches and 726 true non-matches
    (23.17% true matches)
  Identified 890 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   854  (95.96%)
          2 :    33  (3.71%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)
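
The occurrence table above counts how many unique weight vectors appear exactly once, twice, and so on; with `collections.Counter` this is a two-level count (a sketch, not the program's own code):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Occurrence count -> number of unique weight vectors occurring
    exactly that often (the table printed above)."""
    per_vector = Counter(tuple(v) for v in weight_vectors)   # vector -> count
    return Counter(per_vector.values())                      # count -> how many

vectors = [[0.1, 0.2], [0.1, 0.2], [0.5, 0.9], [1.0, 0.0], [0.5, 0.9], [0.5, 0.9]]
print(sorted(occurrence_distribution(vectors).items()))  # → [(1, 1), (2, 1), (3, 1)]
```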

Identified 1 non-pure unique weight vector (from 890 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 705

Removed 1 non-pure weight vector

Final number of weight vectors to use: 944
  Number of unique weight vectors: 890
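
The pureness step groups identical weight vectors by their true-match labels and removes the minority-class copies of any non-pure group (here the single 0.947-pure vector, 18 matches and 1 non-match among 19 copies, loses its one non-match copy: 945 → 944). A sketch under that reading of the log; tie handling is an assumption, since no 50/50 group appears:

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """Group identical weight vectors; in every non-pure group keep
    only the copies carrying the majority true-match label.  Ties are
    broken toward non-match here, an assumption -- the log never
    shows a 50/50 group."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)

    kept = []
    for vec, labels in groups.items():
        majority = sum(labels) > len(labels) - sum(labels)
        kept.extend((list(vec), lab) for lab in labels if lab == majority)
    return kept

# 19 copies of one vector, 18 matches + 1 non-match: the non-match copy goes
sample = ([([0.9, 1.0], True)] * 18 + [([0.9, 1.0], False)]
          + [([0.1, 0.2], False)] * 2)
print(len(remove_minority_copies(sample)))  # → 20
```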

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (890, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 890 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 890 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 804 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 130 matches and 674 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (674, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 130 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 130 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-matches
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(10)850_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.197324
f-measure              0.329609
da                           59
dm                            0
ndm                           0
tp                           59
fp                            0
tn                  4.76529e+07
fn                          240
Name: (10, 1 - acm diverg, 850), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)850_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 661
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 661 weight vectors
  Containing 198 true matches and 463 true non-matches
    (29.95% true matches)
  Identified 616 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   582  (94.48%)
          2 :    31  (5.03%)
          3 :     2  (0.32%)
         11 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 616 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 442

Removed 1 non-pure weight vector

Final number of weight vectors to use: 660
  Number of unique weight vectors: 616

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (616, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 616 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 616 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 533 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 157 matches and 376 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (157, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (376, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 376 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 376 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.750, 0.524, 0.400, 0.813, 0.611] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.600, 0.857, 0.579, 0.286, 0.545] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.417, 0.750, 0.500, 0.455] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.857, 0.444, 0.556, 0.235, 0.500] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.714, 0.318, 0.583, 0.417, 0.727] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 1 match and 69 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.108
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0
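
The purity and entropy figures reported by each oracle step can be reproduced from the match/non-match counts. A minimal sketch, assuming purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the class distribution (the function names are illustrative, not from the original program):

```python
import math

def purity(num_matches, num_non_matches):
    # Fraction of the sample belonging to the majority class.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Base-2 Shannon entropy of the match / non-match distribution.
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        p = count / total
        if p > 0.0:
            h -= p * math.log2(p)
    return h

# Reproduce the figures for the 70-vector oracle call above:
print(round(purity(1, 69), 3))   # 0.986
print(round(entropy(1, 69), 3))  # 0.108
```

The same formulas match the later oracle calls, e.g. 28 matches and 58 non-matches give purity 0.674 and entropy 0.910.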

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

59.0
Analysing the file: diverg(20)845_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 845), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)845_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)
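
A frequency distribution like the one above can be computed by hashing each weight vector and counting duplicates; a minimal sketch using `collections.Counter` (an assumed approach, with an illustrative helper name):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # Count how often each identical weight vector occurs...
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    # ...then map "occurs n times" -> "number of unique vectors with that count".
    return Counter(vec_counts.values())

vecs = [(0.5, 1.0), (0.5, 1.0), (0.2, 0.3),
        (0.9, 0.9), (0.9, 0.9), (0.9, 0.9)]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```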

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614
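
Pureness appears to be the fraction of a unique weight vector's occurrences that come from true matches; the 0.950 entry above is consistent with a vector occurring 20 times as 19 matches and 1 non-match, whose single minority-class copy is then removed. A minimal sketch under that assumption (the `pureness` helper is hypothetical):

```python
from collections import defaultdict

def pureness(labelled_vectors):
    """labelled_vectors: iterable of (weight_vector_tuple, is_true_match)."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[vec].append(is_match)
    # Pureness = fraction of a vector's occurrences that are true matches.
    return {vec: sum(flags) / len(flags) for vec, flags in groups.items()}

# 19 match copies and 1 non-match copy of the same vector:
pairs = [((0.9, 0.8), True)] * 19 + [((0.9, 0.8), False)]
print(round(pureness(pairs)[(0.9, 0.8)], 3))  # 0.95
```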

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
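
The "far" method reads like greedy farthest-first traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch under that assumption (Euclidean distance and an arbitrary first seed; the original program may seed and measure differently):

```python
import math

def farthest_first(vectors, k):
    selected = [vectors[0]]            # arbitrary starting vector
    remaining = list(vectors[1:])
    while remaining and len(selected) < k:
        # Pick the vector farthest from everything selected so far.
        next_vec = max(remaining,
                       key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(next_vec)
        selected.append(next_vec)
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(vecs, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```

The greedy choice keeps each new sample well away from all previous samples, which is why the selected vectors above spread across both the match and non-match regions.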

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches
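
The split classifier trains on the oracle-labelled sample and partitions the remaining cluster by predicted class. A minimal sketch using scikit-learn's `SVC` (an assumption: the kernel choice and the SVM implementation of the original program are not shown in this log):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    # Train on the oracle-labelled weight vectors (True = match).
    clf = SVC(kernel="linear")  # linear kernel chosen as a simple default
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if not p]
    return matches, non_matches

# Toy example: high similarities behave like matches, low like non-matches.
train = [[0.9, 0.8], [0.95, 0.9], [0.1, 0.2], [0.2, 0.1]]
labels = [True, True, False, False]
matches, non_matches = svm_split(train, labels, [[0.85, 0.9], [0.15, 0.1]])
print(len(matches), len(non_matches))
```

The two predicted subsets then re-enter the cluster queue, as the Loop 2 output below shows.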

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 153 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)843_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 843), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)843_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 932
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 932 weight vectors
  Containing 200 true matches and 732 true non-matches
    (21.46% true matches)
  Identified 887 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   853  (96.17%)
          2 :    31  (3.49%)
          3 :     2  (0.23%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 887 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 711

Removed 1 non-pure weight vector

Final number of weight vectors to use: 931
  Number of unique weight vectors: 887

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (887, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 887 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 887 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 801 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 158 matches and 643 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (158, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (643, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 643 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 643 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 1 match and 73 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.103
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(10)7_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.982759
recall                 0.190635
f-measure              0.319328
da                           58
dm                            0
ndm                           0
tp                           57
fp                            1
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 7), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)7_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 932
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 932 weight vectors
  Containing 200 true matches and 732 true non-matches
    (21.46% true matches)
  Identified 881 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   847  (96.14%)
          2 :    31  (3.52%)
          3 :     2  (0.23%)
         17 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 881 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 169
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 711

Removed 1 non-pure weight vector

Final number of weight vectors to use: 931
  Number of unique weight vectors: 881

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (881, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 881 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 881 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
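
The "far" initial selection above is a farthest-first traversal: start from one weight vector, then greedily add the vector whose minimum distance to the already selected set is largest. A minimal sketch under assumed conventions (Euclidean distance, random starting vector; `farthest_first` is our own hypothetical re-implementation, not the script's function):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first selection of k vectors (hypothetical sketch,
    assuming Euclidean distance and a random starting point)."""
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    selected = [int(rng.integers(len(vectors)))]
    # Minimum distance of every vector to the selected set so far
    min_dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        next_idx = int(np.argmax(min_dist))  # farthest from the selected set
        selected.append(next_idx)
        dist = np.linalg.norm(vectors - vectors[next_idx], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return selected
```

Each iteration costs one distance pass over all vectors, so selecting k of n vectors is O(kn) distance computations, which matches the small sample sizes (86 of 881) seen here.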

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
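
The purity and entropy figures reported after each oracle step follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the split. A sketch (the helper names here are our own, not the script's):

```python
import math

def purity(num_matches, num_non_matches):
    """Fraction of the majority class among the classified weight vectors."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    """Binary Shannon entropy (in bits) of the match/non-match split."""
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# Reproduces the figures reported above for 28 matches / 58 non-matches:
print(round(purity(28, 58), 3))   # -> 0.674
print(round(entropy(28, 58), 3))  # -> 0.910
```

A perfectly pure cluster has purity 1.0 and entropy 0.0; an even split has purity 0.5 and entropy 1.0, which is exactly the initial state of the full cluster in Loop 1.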

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 795 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 140 matches and 655 non-matches
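
The split step trains an SVM on the oracle-labelled examples (28 matches, 58 non-matches) and uses its predictions to divide the remaining unlabelled vectors into two child clusters. A minimal sketch using scikit-learn with synthetic stand-in data — the real script's features, kernel, and parameters may differ:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)
# Synthetic stand-in data: 7 similarity weights per record pair
X_train = np.vstack([rng.uniform(0.5, 1.0, (28, 7)),   # "match" examples
                     rng.uniform(0.0, 0.5, (58, 7))])  # "non-match" examples
y_train = np.array([1] * 28 + [0] * 58)

clf = SVC(kernel="linear")
clf.fit(X_train, y_train)

# Split the remaining unlabelled weight vectors into two child clusters,
# which are then pushed onto the queue for the next loop iteration
X_rest = rng.uniform(0.0, 1.0, (795, 7))
pred = clf.predict(X_rest)
match_cluster = X_rest[pred == 1]
non_match_cluster = X_rest[pred == 0]
```

Both child clusters inherit the parent's purity/entropy estimates until their own samples are oracle-classified, which is why the two queue entries in Loop 2 share identical statistics.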

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (655, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 140 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 140 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 48 matches and 5 non-matches
    Purity of oracle classification:  0.906
    Entropy of oracle classification: 0.451
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(10)777_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 777), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)777_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 697
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 697 weight vectors
  Containing 198 true matches and 499 true non-matches
    (28.41% true matches)
  Identified 652 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   618  (94.79%)
          2 :    31  (4.75%)
          3 :     2  (0.31%)
         11 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 652 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 478

Removed 1 non-pure weight vector

Final number of weight vectors to use: 696
  Number of unique weight vectors: 652

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (652, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 652 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 652 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 25 matches and 58 non-matches
    Purity of oracle classification:  0.699
    Entropy of oracle classification: 0.883
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 569 weight vectors
  Based on 25 matches and 58 non-matches
  Classified 143 matches and 426 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.6987951807228916, 0.8827586787955115, 0.30120481927710846)
    (426, 0.6987951807228916, 0.8827586787955115, 0.30120481927710846)

Current size of match and non-match training data sets: 25 / 58

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 426 weight vectors
- Estimated match proportion 0.301

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 426 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.423, 0.478, 0.500, 0.813, 0.545] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.269, 0.478, 0.750, 0.385, 0.455] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.636, 0.429, 0.632, 0.250, 0.750] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.947, 1.000, 0.292, 0.178, 0.227, 0.122, 0.154] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 8 matches and 60 non-matches
    Purity of oracle classification:  0.882
    Entropy of oracle classification: 0.523
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(10)613_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 613), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)613_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 662
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 662 weight vectors
  Containing 217 true matches and 445 true non-matches
    (32.78% true matches)
  Identified 629 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   613  (97.46%)
          2 :    13  (2.07%)
          3 :     2  (0.32%)
         17 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 629 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 444

Removed 1 non-pure weight vector

Final number of weight vectors to use: 661
  Number of unique weight vectors: 629

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (629, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 629 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 629 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 33 matches and 50 non-matches
    Purity of oracle classification:  0.602
    Entropy of oracle classification: 0.970
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 546 weight vectors
  Based on 33 matches and 50 non-matches
  Classified 176 matches and 370 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (176, 0.6024096385542169, 0.9695235828220428, 0.39759036144578314)
    (370, 0.6024096385542169, 0.9695235828220428, 0.39759036144578314)

Current size of match and non-match training data sets: 33 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.60 and entropy 0.97
- Size 176 weight vectors
- Estimated match proportion 0.398

Sample size for this cluster: 61

Farthest first selection of 61 weight vectors from 176 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.890, 1.000, 0.281, 0.136, 0.183, 0.250, 0.163] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
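
Farthest-first selection picks a diverse sample by greedily choosing, at each step, the vector whose minimum Euclidean distance to everything already selected is largest. A sketch under the assumptions of Euclidean distance and a fixed start vector (the program's actual start and metric choices may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric tuples.

    Maintains, for every vector, its minimum distance to the selected
    set, and repeatedly adds the vector with the largest such distance.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                    # assumption: start vector
    min_dist = [dist(vectors[0], v) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):        # update nearest-selected dists
            d = dist(vectors[idx], v)
            if d < min_dist[i]:
                min_dist[i] = d
    return selected
```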

Perform oracle with 100.00% accuracy on 61 weight vectors
  The oracle will correctly classify 61 weight vectors and wrongly classify 0
  Classified 44 matches and 17 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  17
    Number of false non-matches: 0

Deleted 61 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)137_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 137), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)137_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec
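
The pureness analysis above groups identical weight vectors, computes each group's fraction of true matches, and drops the minority-class copies of any mixed group (here, the single non-match among the 20 identical copies at pureness 0.95). A sketch, assuming the input is a list of (weights_tuple, is_match) pairs:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """Drop the minority-class copies of any unique weight vector that
    occurs with both match and non-match labels, so that every surviving
    unique weight vector is pure (pureness 0.0 or 1.0).
    """
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)       # fraction of matches
        majority_is_match = pureness >= 0.5        # assumption: ties kept as matches
        for is_match in labels:
            if is_match == majority_is_match:
                kept.append((vec, is_match))
    return kept
```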

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)544_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 544), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)544_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
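The purity and entropy values the log reports are derived from the oracle's match / non-match counts: purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the split. A minimal sketch (the function name `purity_entropy` is illustrative, not from the original script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = Shannon entropy
    (base 2) of the match / non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# Counts from the oracle call on 69 weight vectors above:
purity, entropy = purity_entropy(13, 56)
print(round(purity, 3), round(entropy, 3))  # 0.812 0.698
```

These reproduce exactly the "Purity 0.812, Entropy 0.698" figures printed above.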

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(15)147_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 147), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)147_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 794
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 794 weight vectors
  Containing 221 true matches and 573 true non-matches
    (27.83% true matches)
  Identified 740 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   703  (95.00%)
          2 :    34  (4.59%)
          3 :     2  (0.27%)
         17 :     1  (0.14%)
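The uniqueness and occurrence-frequency analysis above (740 unique vectors; how many vectors occur once, twice, etc.) can be sketched with `collections.Counter` applied twice. The toy vectors below are hypothetical, only to show the shape of the computation:

```python
from collections import Counter

# Hypothetical weight vectors, as hashable tuples; the real run has
# 794 vectors of which 740 are unique.
weight_vecs = [
    (1.0, 0.0, 0.5), (1.0, 0.0, 0.5), (0.6, 1.0, 0.8),
    (0.2, 0.0, 0.1), (0.2, 0.0, 0.1), (0.2, 0.0, 0.1),
]

vec_counts = Counter(weight_vecs)          # vector -> occurrence count
freq_dist = Counter(vec_counts.values())   # occurrence count -> #vectors

print(len(vec_counts))                     # 3 unique vectors
print(sorted(freq_dist.items()))           # [(1, 1), (2, 1), (3, 1)]
```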

Identified 1 non-pure unique weight vectors (from 740 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 552

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 793
  Number of unique weight vectors: 740

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (740, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 740 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 740 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
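The "farthest first" selection above is a greedy farthest-first traversal: each new pick maximises its minimum distance to the vectors already selected, so the sample spreads over the weight-vector space. A sketch under assumed choices (Euclidean distance, first pick at index 0; the original script's metric and first-pick rule may differ):

```python
import math

def farthest_first(vectors, k, start_idx=0):
    """Greedily select k vectors: each new pick maximises its minimum
    Euclidean distance to the already-selected vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start_idx]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, vectors[start_idx]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist = [min(d, dist(v, vectors[nxt]))
                    for d, v in zip(min_dist, vectors)]
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(vecs, 3))  # [0, 1, 4]
```

After picking the two extreme corners (0,0) and (1,1), the point farthest from both is the centre (0.5, 0.5), which is why farthest-first samples tend to cover both match and non-match corners of the space.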

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 655 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 154 matches and 501 non-matches
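The split step above trains an SVM on the oracle-labelled sample (28 matches, 57 non-matches) and uses it to divide the remaining 655 vectors of the cluster into a predicted-match and a predicted-non-match sub-cluster. A sketch using scikit-learn's `SVC` on synthetic data (the original script's SVM implementation and kernel are assumptions here):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Hypothetical oracle-labelled sample: 1 = match, 0 = non-match.
# Matches get higher similarity weights, mimicking the vectors above.
X_train = np.vstack([rng.uniform(0.6, 1.0, (28, 7)),   # matches
                     rng.uniform(0.0, 0.5, (57, 7))])  # non-matches
y_train = np.array([1] * 28 + [0] * 57)

# Remaining unlabelled weight vectors in the cluster.
X_rest = rng.uniform(0.0, 1.0, (655, 7))

clf = SVC(kernel="linear").fit(X_train, y_train)
pred = clf.predict(X_rest)

# The predicted labels define the two sub-clusters pushed onto the queue.
print(int(pred.sum()), int((pred == 0).sum()))
```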

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (154, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (501, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 501 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 501 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.767, 0.667, 0.545, 0.786, 0.773] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 4 matches and 68 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.310
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(10)44_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 44), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)44_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 717
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 717 weight vectors
  Containing 193 true matches and 524 true non-matches
    (26.92% true matches)
  Identified 675 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   640  (94.81%)
          2 :    32  (4.74%)
          3 :     2  (0.30%)
          7 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 675 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 171
     0.000 : 504

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 717
  Number of unique weight vectors: 675

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (675, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 675 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 675 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 591 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 285 matches and 306 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (285, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (306, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 306 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 306 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.833, 0.571, 0.727, 0.647, 0.857] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.833, 0.364, 0.417, 0.800, 0.545] (False)
    [0.800, 0.000, 0.625, 0.571, 0.467, 0.474, 0.667] (False)
    [1.000, 0.000, 0.636, 0.429, 0.632, 0.250, 0.750] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.385, 0.391, 0.667, 0.579, 0.824] (False)
    [1.000, 0.000, 0.750, 0.429, 0.526, 0.500, 0.846] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [1.000, 0.000, 0.067, 0.550, 0.818, 0.727, 0.762] (False)
    [1.000, 0.000, 0.556, 0.222, 0.444, 0.429, 0.300] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.500, 0.600, 0.294, 0.600, 0.500] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.000, 0.700, 0.818, 0.444, 0.619] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 0 matches and 69 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analyzing file: diverg(10)293_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 293), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)293_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 343
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 343 weight vectors
  Containing 191 true matches and 152 true non-matches
    (55.69% true matches)
  Identified 322 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   308  (95.65%)
          2 :    11  (3.42%)
          3 :     2  (0.62%)
          7 :     1  (0.31%)

Identified 0 non-pure unique weight vectors (from 322 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 170
     0.000 : 152

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 343
  Number of unique weight vectors: 322

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (322, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 322 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 322 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
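The "farthest first" selection logged above can be sketched as follows. This is a reconstruction under stated assumptions, not the program's actual code: the start vector, Euclidean distance metric, and tie-breaking are all guesses.

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Farthest-first traversal (a sketch): repeatedly pick the vector
    whose minimum Euclidean distance to the already-selected set is
    largest.  Start vector, metric, and tie-breaking are assumptions;
    the original program may differ."""
    rng = random.Random(seed)
    selected = [rng.choice(vectors)]
    while len(selected) < k:
        # A candidate's distance to the selected set is the distance to
        # its nearest selected vector; greedily take the farthest one.
        best = max(vectors,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

The greedy rule spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.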

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 51 matches and 23 non-matches
    Purity of oracle classification:  0.689
    Entropy of oracle classification: 0.894
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  23
    Number of false non-matches: 0
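The purity and entropy figures logged for each oracle classification are consistent with the majority-class fraction and the binary Shannon entropy of the match/non-match split. A minimal sketch (function name is illustrative):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity is the majority-class fraction of the sample; entropy is
    the binary Shannon entropy (in bits) of the match/non-match split."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the oracle result above, `purity_entropy(51, 23)` gives approximately (0.689, 0.894), matching the logged values.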

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 248 weight vectors
  Based on 51 matches and 23 non-matches
  Classified 248 matches and 0 non-matches
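The split step trains a classifier on the oracle-labelled sample and divides the remaining weight vectors into two child clusters, which go back onto the queue. A sketch using scikit-learn's `SVC`; the log only says "SVM", so the kernel and parameters here are assumptions:

```python
from sklearn.svm import SVC  # assumption: log says "SVM", nothing more

def svm_split(train_vectors, train_labels, rest_vectors):
    """Train an SVM on the oracle-classified sample, then split the
    unclassified remainder into predicted-match and predicted-non-match
    child clusters."""
    clf = SVC(kernel="linear")  # kernel choice is a guess
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(rest_vectors)
    matches = [v for v, p in zip(rest_vectors, preds) if p == 1]
    non_matches = [v for v, p in zip(rest_vectors, preds) if p == 0]
    return matches, non_matches
```

Note that with only one class predicted (as in the 248/0 split above), one child cluster is empty and only the other is enqueued.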

68.0
Analysing the file: diverg(10)802_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (10, 1 - acm diverg, 802), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)802_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 740
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 740 weight vectors
  Containing 202 true matches and 538 true non-matches
    (27.30% true matches)
  Identified 690 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   656  (95.07%)
          2 :    31  (4.49%)
          3 :     2  (0.29%)
         16 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 690 unique weight vectors)
Pureness (as a proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 172
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 517

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 739
  Number of unique weight vectors: 690
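The pureness analysis above can be sketched like this: group duplicate weight vectors, compute for each unique vector the fraction of its occurrences generated by true matches, and count how many unique vectors share each pureness value. Names and rounding are illustrative, not from the original program:

```python
from collections import defaultdict

def pureness_counts(weight_vectors, match_flags):
    """For each unique weight vector, pureness is the fraction of its
    occurrences that are true matches; return how many unique vectors
    have each pureness value.  A value strictly between 0 and 1 marks a
    non-pure vector, whose minority-class copies get removed."""
    occ = defaultdict(lambda: [0, 0])        # vector -> [matches, total]
    for vec, is_match in zip(weight_vectors, match_flags):
        key = tuple(vec)
        occ[key][0] += int(is_match)
        occ[key][1] += 1
    counts = defaultdict(int)
    for n_match, n_total in occ.values():
        counts[round(n_match / n_total, 3)] += 1
    return dict(counts)
```

In the run above, one unique vector had pureness 0.938, so its single non-match copy was removed, leaving 739 of the 740 weight vectors.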

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (690, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 690 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 690 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 606 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 140 matches and 466 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (466, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 140 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 140 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 47 matches and 7 non-matches
    Purity of oracle classification:  0.870
    Entropy of oracle classification: 0.556
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing the file: diverg(10)992_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (10, 1 - acm diverg, 992), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)992_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 814
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 814 weight vectors
  Containing 220 true matches and 594 true non-matches
    (27.03% true matches)
  Identified 758 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   722  (95.25%)
          2 :    33  (4.35%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 758 unique weight vectors)
Pureness (as a proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 573

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 813
  Number of unique weight vectors: 758

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (758, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 758 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 758 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 673 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 146 matches and 527 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (527, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 146 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)246_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 246), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)246_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 883
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 883 weight vectors
  Containing 177 true matches and 706 true non-matches
    (20.05% true matches)
  Identified 844 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   814  (96.45%)
          2 :    27  (3.20%)
          3 :     2  (0.24%)
          9 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 844 unique weight vectors)
Pureness (as a proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 158
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 685

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 874
  Number of unique weight vectors: 843

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (843, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 843 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 843 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
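
Farthest-first traversal, as used above to pick 86 spread-out weight vectors, greedily adds the vector with the largest distance to its nearest already-selected vector. A dependency-free sketch (function name and seeding are illustrative, not the script's actual code):

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first selection: start from a random vector,
    then repeatedly add the vector whose Euclidean distance to the
    closest already-selected vector is largest (the classic
    2-approximation heuristic for the k-center problem)."""
    rnd = random.Random(seed)
    selected = [rnd.choice(vectors)]
    while len(selected) < k:
        def dist_to_selected(v):
            return min(math.dist(v, s) for s in selected)
        selected.append(max(vectors, key=dist_to_selected))
    return selected
```

Because each new pick maximises the distance to the current selection, the sample tends to cover the extremes of the weight-vector space, which is why both clear matches and clear non-matches appear in the list above.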

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
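
The purity and entropy figures above are consistent with the majority-class fraction and the binary Shannon entropy of the oracle's match/non-match split: 28 matches out of 86 gives purity 58/86 ≈ 0.674 and entropy ≈ 0.910 bits. A small sketch with a hypothetical helper:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = binary
    Shannon entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1 - p)
    entropy = 0.0
    for q in (p, 1 - p):
        if q > 0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

A perfectly pure cluster has purity 1.0 and entropy 0.0; a 50/50 split has purity 0.5 and entropy 1.0, matching the initial queue entry `(843, 0.5, 1.0, 0.5)`.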

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 757 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 139 matches and 618 non-matches
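
The split step trains a classifier on the 86 oracle-labelled vectors and partitions the remaining 757 into a predicted-match and a predicted-non-match sub-cluster, which then re-enter the queue. As a dependency-free illustration, here is the same partitioning with a nearest-centroid classifier standing in for the SVM (a deliberate simplification; all names are hypothetical):

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def split_cluster(train_matches, train_non_matches, remaining):
    """Split `remaining` into predicted match / non-match sub-clusters
    by distance to the class centroids of the oracle-labelled sample
    (nearest-centroid stand-in for the SVM used in the log)."""
    cm = centroid(train_matches)
    cn = centroid(train_non_matches)
    match_cluster, non_match_cluster = [], []
    for v in remaining:
        if math.dist(v, cm) < math.dist(v, cn):
            match_cluster.append(v)
        else:
            non_match_cluster.append(v)
    return match_cluster, non_match_cluster
```

Either sub-cluster inherits the parent's purity/entropy estimate until its own sample is oracle-labelled, which is why both queue entries in Loop 2 below show the same (0.674, 0.910) statistics.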

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (139, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (618, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 618 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 618 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 1 match and 73 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.103
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing file: diverg(15)621_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 621), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)621_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 721
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 721 weight vectors
  Containing 217 true matches and 504 true non-matches
    (30.10% true matches)
  Identified 666 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   630  (94.59%)
          2 :    33  (4.95%)
          3 :     2  (0.30%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 666 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 483

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 720
  Number of unique weight vectors: 666

Time to load and analyse the weight vector file: 0.04 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (666, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 666 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 666 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 23 matches and 61 non-matches
    Purity of oracle classification:  0.726
    Entropy of oracle classification: 0.847
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 582 weight vectors
  Based on 23 matches and 61 non-matches
  Classified 0 matches and 582 non-matches

40.0
Analysing file: diverg(15)579_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (15, 1 - acm diverg, 579), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)579_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 882
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 882 weight vectors
  Containing 187 true matches and 695 true non-matches
    (21.20% true matches)
  Identified 842 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   808  (95.96%)
          2 :    31  (3.68%)
          3 :     2  (0.24%)
          6 :     1  (0.12%)

Identified 0 non-pure unique weight vectors (from 842 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 167
     0.000 : 675

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 882
  Number of unique weight vectors: 842

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (842, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 842 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 842 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 756 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 149 matches and 607 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (607, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 607 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 607 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 1 match and 73 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.103
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0
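The purity and entropy figures reported after each oracle call follow the usual definitions: purity is the fraction of the majority class in the labelled sample, and entropy is the base-2 Shannon entropy of the match/non-match distribution. A minimal sketch (not the original script's code, just the same arithmetic):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = base-2 Shannon
    entropy of the two-class match/non-match distribution."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1 - p)
    entropy = 0.0
    for q in (p, 1 - p):
        if q > 0:                    # 0*log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the 1 match / 73 non-matches above this gives purity 0.986 and entropy 0.103, matching the log.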

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(20)670_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 670), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)670_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
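The "farthest first" selection used above greedily picks, at each step, the weight vector with the largest distance to its nearest already-selected vector (the classic farthest-first traversal, a 2-approximation for k-center). A self-contained sketch under the assumption of Euclidean distance — the original script's implementation and distance metric may differ:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly add the vector whose distance to its closest selected
    vector is maximal, until k vectors are selected."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

Because each new pick maximises the distance to the current sample, the selection tends to cover the extremes of the weight-vector space, which is why the lists above mix clear matches and clear non-matches.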

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches
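The SVM step trains on the oracle-labelled vectors (26 matches, 61 non-matches here) and uses the resulting classifier to split the remaining unlabelled vectors of the cluster into a predicted-match and a predicted-non-match sub-cluster. A hedged sketch assuming scikit-learn's `SVC` — the original script may use a different SVM library or kernel:

```python
# Assumes scikit-learn is installed; function and variable names are
# illustrative, not taken from the original script.
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Fit a linear SVM on labelled match (1) / non-match (0) weight
    vectors, then partition the remaining vectors by its predictions."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if p == 0]
    return matches, non_matches
```

The two sub-clusters are then pushed back onto the queue, which is why Loop 2 below starts with a queue of length 2 (sizes 118 and 793).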

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 118 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 118 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)779_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 779), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)779_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 566 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)222_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 222), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)222_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
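
The "far" initial-selection method above is a farthest-first traversal. The script's own implementation is not shown in this output; the following is a minimal sketch under the assumptions of Euclidean distance between weight vectors and a fixed (first-vector) starting point rather than a random one:

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal: start from the
    first vector, then repeatedly add the vector whose minimum
    distance to the already-selected set is largest (max-min)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    # min_d[i] tracks vector i's distance to its nearest selected vector
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], dist(v, vectors[i]))
    return selected
```

The max-min criterion is why the selected vectors above are spread across the weight space (both near-match corners like all-1.0 vectors and near-non-match corners) rather than clustered.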

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
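
The purity and entropy reported for a labelled sample follow the usual two-class definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. Checking against the 23 matches and 65 non-matches classified above:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy (base 2) of the match proportion."""
    n = num_match + num_non_match
    p = num_match / n                 # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                   # 0 * log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy

purity, entropy = purity_entropy(23, 65)
print(round(purity, 3), round(entropy, 3))   # 0.739 0.829
```

The match proportion 23/88 = 0.261 is also what the log carries forward as the estimated match proportion of the two child clusters in the next loop.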

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches
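
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by predicted class, producing the two child clusters seen in the queue below. A sketch using scikit-learn's `SVC` (the original script's exact SVM library and settings are not visible in this output):

```python
from sklearn import svm

def svm_split(train_vectors, train_labels, cluster_vectors):
    """Train an SVM on the oracle-labelled sample (labels: 1 = match,
    0 = non-match), then split the remaining cluster vectors into a
    predicted-match child and a predicted-non-match child."""
    clf = svm.SVC(kernel="linear")
    clf.fit(train_vectors, train_labels)
    pred = clf.predict(cluster_vectors)
    match_child = [v for v, p in zip(cluster_vectors, pred) if p == 1]
    non_match_child = [v for v, p in zip(cluster_vectors, pred) if p == 0]
    return match_child, non_match_child
```

In the run above, a classifier trained on 23 matches and 65 non-matches splits the 950 unlabelled vectors into children of size 103 and 847.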

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
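
Read as a whole, each run in this log traces a queue-driven procedure: pop a cluster, sample it, label the sample via the oracle, grow the match/non-match training sets, and split clusters that are still impure or too large, until the manual-classification budget is exhausted. A schematic sketch, where `sample_fn`, `oracle_fn`, and `split_fn` stand in for the farthest-first selection, oracle, and SVM steps shown above (cluster selection here is FIFO rather than the random queue ordering in the log):

```python
def recursive_select(initial_cluster, budget, sample_fn, oracle_fn, split_fn,
                     min_purity=0.95, max_cluster_size=100):
    """Queue-driven recursive training-example selection: sample each
    cluster, label the sample with the oracle, and split impure or
    oversized clusters until the labelling budget is spent."""
    queue = [initial_cluster]
    train_match, train_non_match = [], []
    used = 0                                  # oracle classifications performed
    while queue and used < budget:
        cluster = queue.pop(0)                # queue ordering: FIFO here
        sample = sample_fn(cluster)           # e.g. farthest-first selection
        labels = oracle_fn(sample)            # manual (oracle) classification
        used += len(sample)
        train_match += [v for v, m in zip(sample, labels) if m]
        train_non_match += [v for v, m in zip(sample, labels) if not m]
        rest = [v for v in cluster if v not in sample]
        matches = sum(labels)
        purity = max(matches, len(labels) - matches) / max(len(labels), 1)
        # split clusters that are not pure enough or too large, e.g. with
        # an SVM trained on the labelled sample so far
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            queue.extend(c for c in split_fn(train_match, train_non_match, rest)
                         if c)
    return train_match, train_non_match, used
```

This matches the loop headers above: the queue length grows as clusters are split, and the run stops with "Reached end of manual classification budget" once `used` reaches the budget.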

46.0
Analysing the file: diverg(10)380_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990385
recall                 0.344482
f-measure              0.511166
da                          104
dm                            0
ndm                           0
tp                          103
fp                            1
tn                  4.76529e+07
fn                          196
Name: (10, 1 - acm diverg, 380), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)380_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 720
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 720 weight vectors
  Containing 160 true matches and 560 true non-matches
    (22.22% true matches)
  Identified 699 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   688  (98.43%)
          2 :     8  (1.14%)
          3 :     2  (0.29%)
         10 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 699 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 141
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 557

Removed 1 non-pure weight vector

Final number of weight vectors to use: 719
  Number of unique weight vectors: 699

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (699, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 699 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 699 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 615 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 115 matches and 500 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (115, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (500, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 115 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 115 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 41 matches and 10 non-matches
    Purity of oracle classification:  0.804
    Entropy of oracle classification: 0.714
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  10
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

104.0
Analysing the file: diverg(20)13_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (20, 1 - acm diverg, 13), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)13_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1041
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1041 weight vectors
  Containing 213 true matches and 828 true non-matches
    (20.46% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   954  (96.46%)
          2 :    32  (3.24%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 181
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 807

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1040
  Number of unique weight vectors: 989

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
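
The purity and entropy figures the oracle reports can be reproduced from the match / non-match counts alone. A minimal sketch (the function name `cluster_stats` is illustrative, not taken from the script):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = binary entropy
    (in bits) of the match / non-match split: 0.0 = pure, 1.0 = 50/50."""
    total = num_matches + num_non_matches
    p = num_matches / total          # proportion of matches
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy, p        # p doubles as the estimated match proportion
```

For 26 matches and 61 non-matches this gives purity 0.701 and entropy 0.880, matching the values logged above.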

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 109 matches and 793 non-matches
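
The SVM step that splits the remaining weight vectors into two sub-clusters can be sketched with scikit-learn. The linear kernel and the helper name `svm_split` are assumptions here; the script's actual classifier settings are not visible in this log:

```python
from sklearn.svm import SVC

def svm_split(train_matches, train_non_matches, remaining):
    """Train an SVM on the oracle-labelled sample, then split the cluster's
    remaining weight vectors into predicted matches and non-matches.
    (Kernel choice is an assumption, not taken from the script.)"""
    X = train_matches + train_non_matches
    y = [1] * len(train_matches) + [0] * len(train_non_matches)
    clf = SVC(kernel="linear").fit(X, y)
    pred = clf.predict(remaining)
    matches = [v for v, p in zip(remaining, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining, pred) if p == 0]
    return matches, non_matches
```

The two resulting lists become the new clusters pushed onto the queue, which is why the queue length grows to 2 in the next loop.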

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 109 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
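
The farthest-first selections shown in this log can be sketched as a greedy traversal: seed the selection, then repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest. Seeding with the first vector is an assumption; the script's actual seeding rule is not visible here:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors
    (tuples of similarity scores). Returns k selected vectors."""
    selected = [vectors[0]]           # assumed seed: the first vector
    candidates = list(vectors[1:])
    while len(selected) < k and candidates:
        # pick the candidate farthest from its nearest selected vector
        best = max(candidates,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        candidates.remove(best)
    return selected
```

Maximising the minimum distance spreads the sample across the cluster, which is why the selected vectors above mix clear matches and clear non-matches rather than clustering around one corner.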


Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)442_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 442), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)442_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 954
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 954 weight vectors
  Containing 205 true matches and 749 true non-matches
    (21.49% true matches)
  Identified 903 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   869  (96.23%)
          2 :    31  (3.43%)
          3 :     2  (0.22%)
         17 :     1  (0.11%)
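
A frequency distribution like the one above can be computed with `collections.Counter`: count how often each unique weight vector occurs, then count how many unique vectors share each occurrence count. The sample vectors below are hypothetical:

```python
from collections import Counter

# weight vectors stored as tuples of similarity scores, so they are hashable
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.9), (1.0, 0.5), (0.0, 0.0)]

vec_counts = Counter(vectors)             # unique vector -> occurrence count
freq_dist = Counter(vec_counts.values())  # occurrence -> number of unique vectors
```

Here `(1.0, 0.5)` occurs 3 times and the other two vectors once each, so `freq_dist` maps occurrence 1 to 2 unique vectors and occurrence 3 to 1.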

Identified 1 non-pure unique weight vector (from 903 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 728

Removed 1 non-pure weight vector
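
Removing non-pure weight vectors as reported above amounts to dropping the minority-class copies of any unique vector that occurs with both true-match and true-non-match labels. A sketch (the tie rule, keeping ties as matches, is an assumption):

```python
from collections import Counter, defaultdict

def remove_minority_copies(data):
    """data: list of (weight_vector_tuple, is_match) pairs.
    Keeps only the copies whose label agrees with that vector's
    majority label; minority-class copies are removed."""
    label_counts = defaultdict(Counter)
    for vec, is_match in data:
        label_counts[vec][is_match] += 1
    cleaned = []
    for vec, is_match in data:
        counts = label_counts[vec]
        majority = counts[True] >= counts[False]  # assumed tie rule
        if is_match == majority:
            cleaned.append((vec, is_match))
    return cleaned
```

In the run above, one unique vector had pureness 0.941 (a 17-copy vector with one non-match copy, say), so exactly one minority-class copy was dropped.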

Final number of weight vectors to use: 953
  Number of unique weight vectors: 903

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (903, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 903 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 903 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 816 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 112 matches and 704 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (704, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 112 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 46

Farthest first selection of 46 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 46 weight vectors
  The oracle will correctly classify 46 weight vectors and wrongly classify 0
  Classified 46 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 46 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(15)592_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 592), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)592_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 794
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 794 weight vectors
  Containing 213 true matches and 581 true non-matches
    (26.83% true matches)
  Identified 758 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   741  (97.76%)
          2 :    14  (1.85%)
          3 :     2  (0.26%)
         19 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 758 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 578

Removed 1 non-pure weight vector

Final number of weight vectors to use: 793
  Number of unique weight vectors: 758

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (758, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 758 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 758 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.429, 0.786, 0.750, 0.389, 0.857] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 673 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 136 matches and 537 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (537, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 136 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 51 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.232
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)400_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979592
recall                  0.32107
f-measure              0.483627
da                           98
dm                            0
ndm                           0
tp                           96
fp                            2
tn                  4.76529e+07
fn                          203
Name: (10, 1 - acm diverg, 400), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)400_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 687
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 687 weight vectors
  Containing 167 true matches and 520 true non-matches
    (24.31% true matches)
  Identified 650 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   619  (95.23%)
          2 :    28  (4.31%)
          3 :     2  (0.31%)
          6 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 650 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 150
     0.000 : 500

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 687
  Number of unique weight vectors: 650

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (650, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 650 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 650 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

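The selection above is a greedy farthest-first traversal: start from one vector, then repeatedly pick the vector farthest from its nearest already-selected vector. A minimal sketch, assuming Euclidean distance between weight vectors (the log does not show the script's actual distance function or starting rule):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric tuples.

    Starts from the first vector, then repeatedly adds the vector whose
    distance to its nearest selected vector is largest (the classic
    2-approximation for the k-center problem)."""
    selected = [vectors[0]]
    while len(selected) < min(k, len(vectors)):
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(math.dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected

pts = [(0.0, 0.0), (1.0, 0.0), (0.1, 0.0), (0.5, 0.5)]
print(farthest_first(pts, 2))  # [(0.0, 0.0), (1.0, 0.0)]
```

This spreads the sample across the cluster's extremes, which is why the selected vectors above mix clear matches and clear non-matches.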
Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 31 matches and 52 non-matches
    Purity of oracle classification:  0.627
    Entropy of oracle classification: 0.953
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0
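The purity (0.627) and entropy (0.953) reported above follow directly from the 31/52 match/non-match split, assuming purity is the majority-class fraction and entropy is the binary Shannon entropy in bits:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = Shannon entropy
    of the binary match/non-match distribution, in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_and_entropy(31, 52)
print(round(purity, 3), round(entropy, 3))  # 0.627 0.953
```

These two values drive the stopping rule: a cluster whose purity is high enough (and size small enough) is not split further.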

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 567 weight vectors
  Based on 31 matches and 52 non-matches
  Classified 260 matches and 307 non-matches
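The split step trains a classifier on the oracle-labelled sample and uses its predictions to divide the remaining weight vectors into two child clusters, which go back onto the queue. A sketch using scikit-learn's SVC; the kernel and parameters of the original script are assumptions, as the log does not show them:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, remaining_vecs):
    """Train an SVM on the oracle-labelled sample (1 = match, 0 = non-match)
    and split the unlabelled remainder of the cluster by its predictions."""
    clf = SVC(kernel="linear")  # linear kernel is an assumption
    clf.fit(np.asarray(labelled_vecs), np.asarray(labels))
    pred = clf.predict(np.asarray(remaining_vecs))
    matches = [v for v, p in zip(remaining_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, pred) if p == 0]
    return matches, non_matches  # the two child clusters pushed onto the queue

# Tiny illustration with 2-d weight vectors
X = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]]
y = [1, 1, 0, 0]
m, n = svm_split(X, y, [[0.85, 0.8], [0.15, 0.15]])
print(len(m), len(n))  # 1 1
```

Note that both children inherit the parent sample's purity/entropy estimates (0.627/0.953 in the queue listing below) until they are themselves sampled.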

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (260, 0.6265060240963856, 0.9533171305598173, 0.37349397590361444)
    (307, 0.6265060240963856, 0.9533171305598173, 0.37349397590361444)

Current size of match and non-match training data sets: 31 / 52

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 307 weight vectors
- Estimated match proportion 0.373

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 307 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.556, 0.348, 0.467, 0.636, 0.412] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.538, 0.600, 0.471, 0.632, 0.688] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [0.800, 0.000, 0.444, 0.545, 0.333, 0.111, 0.533] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.462, 0.667, 0.636, 0.368, 0.500] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 0 matches and 69 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

98.0
Analysing the file: diverg(15)664_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 664), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)664_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1081
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1081 weight vectors
  Containing 226 true matches and 855 true non-matches
    (20.91% true matches)
  Identified 1024 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   987  (96.39%)
          2 :    34  (3.32%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1024 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 834

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1080
  Number of unique weight vectors: 1024

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1024, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1024 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1024 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 937 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 173 matches and 764 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (173, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (764, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 764 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 78

Farthest first selection of 78 weight vectors from 764 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 1 match and 77 non-matches
    Purity of oracle classification:  0.987
    Entropy of oracle classification: 0.099
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)514_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 514), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)514_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 401
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 401 weight vectors
  Containing 219 true matches and 182 true non-matches
    (54.61% true matches)
  Identified 368 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   352  (95.65%)
          2 :    13  (3.53%)
          3 :     2  (0.54%)
         17 :     1  (0.27%)

Identified 1 non-pure unique weight vector (from 368 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 181

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 400
  Number of unique weight vectors: 368

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (368, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 368 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 368 vectors
  The selected farthest weight vectors are:
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 44 matches and 32 non-matches
    Purity of oracle classification:  0.579
    Entropy of oracle classification: 0.982
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  32
    Number of false non-matches: 0
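
The purity and entropy figures the oracle step reports above can be reproduced with a short sketch (the helper name is hypothetical; it assumes the oracle labels are available as a list of booleans):

```python
import math

def purity_and_entropy(labels):
    """Purity and binary entropy of a list of boolean match labels."""
    n = len(labels)
    p = sum(labels) / n            # proportion of true matches
    purity = max(p, 1.0 - p)       # fraction in the majority class
    if p in (0.0, 1.0):
        entropy = 0.0              # a pure cluster has zero entropy
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy

# 44 matches and 32 non-matches, as in the oracle output above
purity, entropy = purity_and_entropy([True] * 44 + [False] * 32)
print(round(purity, 3), round(entropy, 3))  # 0.579 0.982
```

Note that with 44 of 76 vectors being matches, purity is only 44/76 ≈ 0.579, which is why the cluster is judged not pure enough and is split further.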

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 292 weight vectors
  Based on 44 matches and 32 non-matches
  Classified 154 matches and 138 non-matches
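
The split step above trains a classifier on the oracle-labelled vectors and partitions the remaining cluster by the predicted class. The program uses an SVM; in the dependency-free sketch below a nearest-centroid classifier stands in for the SVM, and all data is synthetic:

```python
import random

def centroid(vecs):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vecs)
    return [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split_cluster(cluster, match_train, non_match_train):
    """Split a cluster by assigning each vector to the nearer class centroid
    (a stand-in for the SVM decision used by the program)."""
    cm, cn = centroid(match_train), centroid(non_match_train)
    matches = [v for v in cluster if dist2(v, cm) <= dist2(v, cn)]
    non_matches = [v for v in cluster if dist2(v, cm) > dist2(v, cn)]
    return matches, non_matches

# Synthetic stand-ins: 44 match / 32 non-match training vectors,
# and 292 unlabelled cluster vectors of 7 similarity weights each.
random.seed(1)
match_train = [[random.uniform(0.6, 1.0) for _ in range(7)] for _ in range(44)]
non_match_train = [[random.uniform(0.0, 0.4) for _ in range(7)] for _ in range(32)]
cluster = [[random.random() for _ in range(7)] for _ in range(292)]

m, n = split_cluster(cluster, match_train, non_match_train)
print(len(m) + len(n))  # 292 -- every vector lands in exactly one sub-cluster
```

Both resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.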

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (154, 0.5789473684210527, 0.9819407868640977, 0.5789473684210527)
    (138, 0.5789473684210527, 0.9819407868640977, 0.5789473684210527)

Current size of match and non-match training data sets: 44 / 32

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 138 weight vectors
- Estimated match proportion 0.579

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 138 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.821, 1.000, 0.275, 0.297, 0.227, 0.255, 0.152] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.261, 0.174, 0.148, 0.186, 0.148] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.881, 1.000, 0.211, 0.250, 0.129, 0.250, 0.211] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.750, 1.000, 0.214, 0.184, 0.250, 0.267, 0.111] (False)
    [0.850, 1.000, 0.179, 0.205, 0.188, 0.061, 0.180] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.750, 1.000, 0.146, 0.130, 0.176, 0.318, 0.167] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.750, 1.000, 0.243, 0.243, 0.214, 0.111, 0.132] (False)
    [0.929, 1.000, 0.250, 0.193, 0.250, 0.164, 0.213] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.592, 1.000, 0.179, 0.205, 0.156, 0.273, 0.180] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.663, 1.000, 0.273, 0.244, 0.226, 0.196, 0.238] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.600, 0.944, 0.250, 0.200, 0.186, 0.136, 0.118] (False)
    [0.663, 1.000, 0.132, 0.143, 0.241, 0.174, 0.167] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.902, 1.000, 0.182, 0.071, 0.182, 0.222, 0.190] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.747, 1.000, 0.231, 0.167, 0.107, 0.222, 0.125] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
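
The farthest-first sampling used above can be sketched as a greedy traversal: each step picks the vector whose minimum distance to the already-selected set is largest (the data below is synthetic; the seeding strategy of the original program is an assumption):

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors: repeatedly add the
    vector farthest (by min-distance) from everything selected so far."""
    selected = [vectors[0]]  # seed with the first vector (an assumption)
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist2(v, s) for s in selected),
        )
        selected.append(best)
    return selected

# Synthetic stand-in for the 138-vector cluster sampled above
random.seed(0)
vectors = [tuple(random.random() for _ in range(7)) for _ in range(138)]
sample = farthest_first(vectors, 56)
print(len(sample))  # 56
```

Farthest-first sampling spreads the queried vectors across the cluster, which makes the oracle labels more informative than a uniform random sample would be.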

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 2 matches and 54 non-matches
    Purity of oracle classification:  0.964
    Entropy of oracle classification: 0.222
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(20)616_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 616), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)616_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 790
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 790 weight vectors
  Containing 208 true matches and 582 true non-matches
    (26.33% true matches)
  Identified 761 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   744  (97.77%)
          2 :    14  (1.84%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 761 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 789
  Number of unique weight vectors: 761
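
The pureness analysis reported above groups identical weight vectors and computes, per unique vector, the fraction of its occurrences that came from true matches; vectors that mix both classes are non-pure and get their minority-class copies removed. A minimal sketch (the function name and toy data are hypothetical):

```python
from collections import defaultdict

def pureness(pairs):
    """For (weight_vector, is_match) pairs, compute per-unique-vector
    pureness = fraction of that vector's occurrences that are true matches."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [match count, total count]
    for vec, is_match in pairs:
        counts[vec][0] += int(is_match)
        counts[vec][1] += 1
    return {vec: m / t for vec, (m, t) in counts.items()}

# Toy data: one vector seen 12 times with 11 matches (non-pure, 11/12 = 0.917,
# matching the 0.917 pureness row in a later log block), one pure non-match.
pairs = ([((1.0, 0.9), True)] * 11 + [((1.0, 0.9), False)]
         + [((0.1, 0.2), False)] * 3)
p = pureness(pairs)
print(sorted(round(v, 3) for v in p.values()))  # [0.0, 0.917]
```

A pureness of exactly 1.0 or 0.0 means all occurrences of that vector carry the same true match status; anything in between marks a non-pure vector.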

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (761, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 761 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 761 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 676 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 133 matches and 543 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 133 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 133 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.420, 1.000, 1.000, 1.000, 1.000, 1.000, 0.947] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)410_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 410), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)410_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 271
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 271 weight vectors
  Containing 163 true matches and 108 true non-matches
    (60.15% true matches)
  Identified 254 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   246  (96.85%)
          2 :     5  (1.97%)
          3 :     2  (0.79%)
          9 :     1  (0.39%)

Identified 1 non-pure unique weight vector (from 254 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 146
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 107

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 262
  Number of unique weight vectors: 253

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (253, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 253 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 70

Perform initial selection using "far" method

Farthest first selection of 70 weight vectors from 253 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 34 matches and 36 non-matches
    Purity of oracle classification:  0.514
    Entropy of oracle classification: 0.999
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  36
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 183 weight vectors
  Based on 34 matches and 36 non-matches
  Classified 121 matches and 62 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 70
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (121, 0.5142857142857142, 0.9994110647387553, 0.4857142857142857)
    (62, 0.5142857142857142, 0.9994110647387553, 0.4857142857142857)

Current size of match and non-match training data sets: 34 / 36

Selected cluster with (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 121 weight vectors
- Estimated match proportion 0.486

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 121 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 45 matches and 9 non-matches
    Purity of oracle classification:  0.833
    Entropy of oracle classification: 0.650
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing the file: diverg(20)190_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 190), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)190_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)
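The occurrence distribution above (880 vectors appear once, 33 twice, and so on) can be computed with two passes of counting; a sketch using `collections.Counter` (an assumed implementation, not the program's):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then
    tally how many distinct vectors share each occurrence count."""
    # Tuples are hashable, so each vector can serve as a dict key
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    # e.g. {1: 880, 2: 33, ...}: 880 distinct vectors occur exactly once
    return Counter(vec_counts.values())

# Toy example: four vectors, one of them duplicated
dist = occurrence_distribution([(0.5, 1.0), (0.5, 1.0), (1.0, 0.0), (0.2, 0.3)])
# dist == {1: 2, 2: 1}
```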

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731
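The minority-class removal reported here (one vector with pureness 0.947, i.e. occurring with both match and non-match labels, has its minority-class copies dropped) could be implemented along these lines; `remove_minority_class` is an illustrative name, not the program's:

```python
from collections import defaultdict

def remove_minority_class(vectors, labels):
    """For each distinct weight vector, keep only the occurrences
    carrying its majority true-match label (assumed behaviour)."""
    groups = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        groups[tuple(vec)].append(lab)
    kept_vecs, kept_labels = [], []
    for vec, labs in groups.items():
        n_match = sum(labs)
        majority = n_match * 2 >= len(labs)  # ties resolved as match
        for lab in labs:
            if lab == majority:
                kept_vecs.append(vec)
                kept_labels.append(lab)
    return kept_vecs, kept_labels
```

For example, a vector occurring 19 times with 18 match and 1 non-match labels (pureness 18/19 ≈ 0.947) keeps its 18 match copies and loses the single non-match copy.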

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
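The "farthest first" selection listed above is a greedy farthest-first traversal: start from some vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A sketch under the assumption of Euclidean distance and a fixed starting point (the program's seeding and tie-breaking may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of equal-length
    numeric tuples, returning k selected vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # arbitrary seed (assumption)
    # Track each vector's distance to its nearest selected vector
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected

selected = farthest_first([(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (0.1, 0.0)], 2)
# selected == [(0.0, 0.0), (10.0, 0.0)]
```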

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches
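The split step trains an SVM on the oracle-labelled sample (24 matches, 63 non-matches) and uses it to divide the remaining 829 vectors into predicted matches and non-matches. A sketch assuming scikit-learn's `SVC` with a linear kernel — the program's actual SVM settings are not visible in this output:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train on the oracle-labelled vectors, then split the rest of
    the cluster into predicted matches and non-matches."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if not p]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.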

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(15)370_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 370), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)370_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 910
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 910 weight vectors
  Containing 214 true matches and 696 true non-matches
    (23.52% true matches)
  Identified 855 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   819  (95.79%)
          2 :    33  (3.86%)
          3 :     2  (0.23%)
         19 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 855 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 675

Removed 1 non-pure weight vector

Final number of weight vectors to use: 909
  Number of unique weight vectors: 855

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (855, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 855 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 855 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 30 matches and 56 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.933
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 769 weight vectors
  Based on 30 matches and 56 non-matches
  Classified 199 matches and 570 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (199, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)
    (570, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)

Current size of match and non-match training data sets: 30 / 56

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 570 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 570 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.808, 0.478, 0.636, 0.786, 0.500] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 0 matches and 76 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(10)129_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 129), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)129_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 359
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 359 weight vectors
  Containing 191 true matches and 168 true non-matches
    (53.20% true matches)
  Identified 338 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   324  (95.86%)
          2 :    11  (3.25%)
          3 :     2  (0.59%)
          7 :     1  (0.30%)

Identified 0 non-pure unique weight vectors (from 338 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 170
     0.000 : 168

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 359
  Number of unique weight vectors: 338

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (338, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 338 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 338 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

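The "far" (farthest-first) selection above greedily picks, at each step, the weight vector whose distance to its nearest already-selected vector is largest, so the sample spreads over the whole cluster. A minimal sketch; the seed vector and the distance metric are assumptions, as the log does not show them:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over weight vectors.

    Assumes Euclidean distance and seeds with the first vector; the
    original script's seed and metric may differ.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    # Track each candidate's distance to its nearest selected vector.
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected
```
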
Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 43 matches and 32 non-matches
    Purity of oracle classification:  0.573
    Entropy of oracle classification: 0.984
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  32
    Number of false non-matches: 0

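The purity and entropy figures reported for the oracle's classification follow the usual two-class definitions: purity is the majority-class proportion, and entropy is the base-2 Shannon entropy of the match/non-match split. Reproducing the numbers for the 43 matches / 32 non-matches above:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Majority-class purity and base-2 Shannon entropy of a 2-class split."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy
```

`purity_entropy(43, 32)` gives (0.573..., 0.984...), matching the log; a fully pure split such as the later 29/0 oracle result gives purity 1.000 and entropy 0.000.
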
Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 263 weight vectors
  Based on 43 matches and 32 non-matches
  Classified 128 matches and 135 non-matches

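After the oracle labels the sample, the remaining 263 vectors are classified by an SVM trained on those labels, and the cluster is split into a predicted-match and a predicted-non-match child, both pushed back onto the queue. The script's SVM implementation and parameters are not shown in the log; this pure-Python stand-in trains a linear SVM by hinge-loss subgradient descent to illustrate the split:

```python
def split_cluster(labeled, remaining, lam=0.01, lr=0.1, epochs=200):
    """Train a linear SVM (hinge loss, subgradient descent) on the
    oracle-labeled sample, then split the remaining weight vectors into
    predicted matches and non-matches.

    `labeled` is a list of (vector, is_match) pairs. This is a stand-in
    for the script's SVM step, whose actual solver isn't logged.
    """
    w, b = [0.0] * len(labeled[0][0]), 0.0
    for _ in range(epochs):
        for x, is_match in labeled:
            y = 1.0 if is_match else -1.0
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1.0:  # inside margin: hinge subgradient step
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:                # outside margin: regulariser only
                w = [wi * (1.0 - lr * lam) for wi in w]
    matches, non_matches = [], []
    for x in remaining:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        (matches if score >= 0.0 else non_matches).append(x)
    return matches, non_matches
```
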
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (128, 0.5733333333333334, 0.9844268978000114, 0.5733333333333334)
    (135, 0.5733333333333334, 0.9844268978000114, 0.5733333333333334)

Current size of match and non-match training data sets: 43 / 32

Selected cluster with (queue ordering: random):
- Purity 0.57 and entropy 0.98
- Size 135 weight vectors
- Estimated match proportion 0.573

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 135 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.146, 0.130, 0.176, 0.318, 0.167] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.821, 1.000, 0.275, 0.297, 0.227, 0.255, 0.152] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.600, 0.944, 0.250, 0.200, 0.186, 0.136, 0.118] (False)
    [0.881, 1.000, 0.211, 0.250, 0.129, 0.250, 0.211] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.750, 1.000, 0.214, 0.184, 0.250, 0.267, 0.111] (False)
    [0.850, 1.000, 0.179, 0.205, 0.188, 0.061, 0.180] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [0.750, 1.000, 0.243, 0.243, 0.214, 0.111, 0.132] (False)
    [0.929, 1.000, 0.250, 0.193, 0.250, 0.164, 0.213] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.635, 1.000, 0.179, 0.265, 0.167, 0.121, 0.241] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.663, 1.000, 0.273, 0.244, 0.226, 0.196, 0.238] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.592, 1.000, 0.179, 0.205, 0.156, 0.273, 0.180] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.902, 1.000, 0.182, 0.071, 0.182, 0.222, 0.190] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.663, 1.000, 0.132, 0.143, 0.241, 0.174, 0.167] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [0.747, 1.000, 0.231, 0.167, 0.107, 0.222, 0.125] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 3 matches and 53 non-matches
    Purity of oracle classification:  0.946
    Entropy of oracle classification: 0.301
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

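The oracle models a human reviewer with a configurable accuracy (the oracle_acc parameter from the usage notes): each sampled vector keeps its true label with probability oracle_acc and is flipped otherwise, so at 100% accuracy, as here, no labels are corrupted. A sketch; how the real script decides which labels to flip is an assumption:

```python
import random

def simulate_oracle(samples, oracle_acc, seed=None):
    """Classify (vector, true_is_match) pairs with an imperfect oracle.

    Each label is flipped with probability 1 - oracle_acc; which labels
    the real script corrupts (and how) is not shown in the log.
    """
    rng = random.Random(seed)
    matches, non_matches = [], []
    tm = fm = tn = fn = 0
    for vec, true_is_match in samples:
        answer = true_is_match if rng.random() < oracle_acc else not true_is_match
        if answer:
            matches.append(vec)
            tm, fm = tm + int(true_is_match), fm + int(not true_is_match)
        else:
            non_matches.append(vec)
            tn, fn = tn + int(not true_is_match), fn + int(true_is_match)
    return matches, non_matches, (tm, fm, tn, fn)
```

With oracle_acc = 1.0 the counts of false matches and false non-matches are always zero, as in every oracle block of this run.
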
Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(10)339_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 339), dtype: object

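The precision, recall, and f-measure in the Series above follow directly from the tp/fp/fn counts: precision = tp/(tp+fp), recall = tp/(tp+fn), and f-measure is their harmonic mean. Checking the printed row (tp = 102, fp = 1, fn = 197):

```python
def pr_f1(tp, fp, fn):
    """Precision, recall and F-measure from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * tp / (2 * tp + fp + fn)  # harmonic mean of P and R
    return precision, recall, f_measure

# Reproduces the Series shown above for diverg(10)339_NEW.csv:
p, r, f = pr_f1(102, 1, 197)
# p ≈ 0.990291, r ≈ 0.341137, f ≈ 0.507463
```
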
Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)339_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 890
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 890 weight vectors
  Containing 154 true matches and 736 true non-matches
    (17.30% true matches)
  Identified 854 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   826  (96.72%)
          2 :    25  (2.93%)
          3 :     2  (0.23%)
          8 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 854 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 138
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 715

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 882
  Number of unique weight vectors: 853

Time to load and analyse the weight vector file: 0.01 sec

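The pureness analysis above groups identical weight vectors and computes, for each unique vector, the fraction of its occurrences that are true matches; any unique vector with mixed labels (0 < pureness < 1) is removed together with all its copies. That is how the single unique vector occurring 8 times with pureness 0.875 (7 matches, 1 non-match) accounts for the 8 removed vectors and 890 → 882. A sketch (the function name is mine):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """Drop weight vectors whose identical copies carry mixed labels.

    `weight_vectors` is a list of (vector_tuple, is_match) pairs; a
    unique vector is "pure" when all its occurrences share one label.
    """
    counts = defaultdict(lambda: [0, 0])  # vector -> [num matches, num total]
    for vec, is_match in weight_vectors:
        counts[vec][0] += int(is_match)
        counts[vec][1] += 1
    kept, removed = [], 0
    for vec, is_match in weight_vectors:
        m, t = counts[vec]
        if 0 < m < t:   # mixed labels: non-pure, remove every copy
            removed += 1
        else:
            kept.append((vec, is_match))
    return kept, removed
```
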
Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (853, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 853 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 853 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 767 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 45 matches and 722 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (45, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (722, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 45 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 29

Farthest first selection of 29 weight vectors from 45 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 1.000, 1.000, 0.952, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)

Perform oracle with 100.00% accuracy on 29 weight vectors
  The oracle will correctly classify 29 weight vectors and wrongly classify 0
  Classified 29 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 29 weight vectors (classified by oracle) from cluster

Cluster is pure enough and not too large, add its 45 weight vectors to:
  Match training set

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 3: Queue length: 1
  Number of manual oracle classifications performed: 115
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (722, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 69 / 62

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 722 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 722 vectors
  The selected farthest weight vectors are:
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 15 matches and 55 non-matches
    Purity of oracle classification:  0.786
    Entropy of oracle classification: 0.750
    Number of true matches:      15
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing file: diverg(10)997_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 997), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)997_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 697
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 697 weight vectors
  Containing 203 true matches and 494 true non-matches
    (29.12% true matches)
  Identified 646 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   612  (94.74%)
          2 :    31  (4.80%)
          3 :     2  (0.31%)
         17 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 646 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 473

Removed 1 non-pure weight vector

Final number of weight vectors to use: 696
  Number of unique weight vectors: 646

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (646, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 646 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 646 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
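The purity and entropy figures reported for each oracle classification can be reproduced from the match / non-match counts alone. A minimal sketch, using the standard two-class purity and binary Shannon entropy (which agree with the numbers in this log):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Two-class purity and binary (Shannon) entropy of a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total                 # match proportion
    purity = max(p, 1.0 - p)                # fraction in the majority class
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = purity_entropy(28, 55)    # counts reported above
print(round(purity, 3), round(entropy, 3))  # 0.663 0.922
```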

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 563 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 153 matches and 410 non-matches
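The split step trains a classifier on the oracle-labelled vectors and partitions the remaining vectors of the cluster. A sketch assuming scikit-learn's `svm.SVC`; the kernel and parameters are assumptions — the log only says "SVM classification":

```python
from sklearn import svm

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on oracle-labelled vectors, then split the remaining
    vectors into predicted-match and predicted-non-match sub-clusters."""
    clf = svm.SVC(kernel='linear')  # kernel choice is an assumption
    clf.fit(labelled_vecs, labels)
    predictions = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, predictions) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, predictions) if p == 0]
    return matches, non_matches
```

The two sub-clusters then re-enter the queue, which is why the queue length grows to 2 in the next loop.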

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (410, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 410 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 71
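The per-cluster sample sizes in this log (87 of 998, 71 of 410, 48 of 118, ...) are consistent with Cochran's sample-size formula with finite-population correction, using z = 1.96 (95% confidence), margin of error e = 0.1, and p set to the cluster's estimated match proportion. This is an inference from the reported numbers, not confirmed by the source:

```python
def cochran_sample_size(cluster_size, est_match_prop, z=1.96, e=0.1):
    """Cochran's sample size with finite-population correction.
    The z and e defaults are assumptions that reproduce this log."""
    pq = est_match_prop * (1.0 - est_match_prop)
    n0 = z * z * pq / (e * e)               # infinite-population sample size
    return int(n0 / (1.0 + (n0 - 1.0) / cluster_size))

print(cochran_sample_size(410, 0.337))  # 71, as reported above
```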

Farthest first selection of 71 weight vectors from 410 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [1.000, 0.000, 0.700, 0.429, 0.476, 0.647, 0.810] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
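The "farthest first" selection above can be sketched as a greedy farthest-first traversal — an illustration, not the original code: start from one vector and repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest.

```python
import math

def farthest_first(vectors, k):
    """Select k vectors, greedily maximising the minimum Euclidean
    distance to the vectors already selected."""
    selected = [vectors[0]]                # starting point is an assumption
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
    return selected

print(farthest_first([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]], 2))
# [[0.0, 0.0], [1.0, 1.0]]
```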

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 2 matches and 69 non-matches
    Purity of oracle classification:  0.972
    Entropy of oracle classification: 0.185
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0
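"Perform oracle with 100.00% accuracy" suggests a simulated oracle that deliberately mislabels a fixed fraction of the queried vectors, matching the "correctly classify X and wrongly classify Y" report. A sketch; the exact flipping scheme is an assumption:

```python
import random

def simulated_oracle(true_labels, accuracy_pct, rng=None):
    """Return the true labels with round(n * (1 - accuracy)) of them
    flipped at random (assumed scheme for a noisy oracle)."""
    rng = rng or random.Random(0)
    n = len(true_labels)
    num_wrong = int(round(n * (1.0 - accuracy_pct / 100.0)))
    wrong = set(rng.sample(range(n), num_wrong))
    return [(not lbl) if i in wrong else lbl
            for i, lbl in enumerate(true_labels)]

labels = [True] * 40 + [False] * 31       # 71 vectors, as above
print(simulated_oracle(labels, 100.0) == labels)  # True: no flips at 100%
```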

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(20)73_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 73), dtype: object
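The block above is a single row of a pandas results DataFrame printed during iteration — hence the `<class 'pandas.core.series.Series'>` line and the `(13,)` shape. A sketch of how such output arises; the DataFrame here is illustrative, with only a subset of the columns:

```python
import pandas as pd

# Column names taken from the log; values are illustrative.
results = pd.DataFrame({'abordagem': ['DS'], 'iteracao': [0],
                        'precision': [1.0], 'recall': [0.140468],
                        'f-measure': [0.246334]})
for _, row in results.iterrows():
    print(type(row))   # <class 'pandas.core.series.Series'>
    print(row.shape)   # (5,) here; (13,) in the log's 13-column table
```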

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)73_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998
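The uniqueness analysis above groups identical weight vectors and histograms how often each unique vector occurs. A sketch using `collections.Counter` — illustrative, not the original code:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique vectors
    occurring that often (the distribution printed in the log)."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())

vecs = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3], [0.9, 0.9]]
print(occurrence_distribution(vecs))  # Counter({1: 2, 2: 1})
```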

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 118 weight vectors
- Estimated match proportion 0.299
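The outer loop keeps a queue of clusters; with queue ordering "random", one cluster is drawn per loop, sampled and oracle-labelled, and its sub-clusters re-enter the queue until the manual classification budget is spent. A structural sketch — the callback and its return shape are assumptions:

```python
import random

def recursive_selection(initial_cluster, budget, sample_and_split):
    """Process clusters from a queue until the manual-labelling budget
    is exhausted; sample_and_split(cluster) must return the number of
    oracle labels spent and a list of sub-clusters to enqueue."""
    queue = [initial_cluster]
    labels_used = 0
    while queue and labels_used < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random ordering
        num_labelled, sub_clusters = sample_and_split(cluster)
        labels_used += num_labelled
        queue.extend(sub_clusters)
    return labels_used
```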

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 118 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)330_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 330), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)330_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 586
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 586 weight vectors
  Containing 196 true matches and 390 true non-matches
    (33.45% true matches)
  Identified 562 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   549  (97.69%)
          2 :    10  (1.78%)
          3 :     2  (0.36%)
         11 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 562 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 389

Removed 1 non-pure weight vector

Final number of weight vectors to use: 585
  Number of unique weight vectors: 562

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (562, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 562 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 562 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 480 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 136 matches and 344 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (344, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 344 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 344 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.565, 0.667, 0.600, 0.412, 0.381] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.455, 0.714, 0.429, 0.550, 0.647] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 9 matches and 59 non-matches
    Purity of oracle classification:  0.868
    Entropy of oracle classification: 0.564
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
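The purity and entropy figures reported by the oracle follow directly from the match/non-match counts: purity is the majority-class fraction and entropy is the Shannon entropy (base 2) of the two-class split. A minimal sketch (function names are mine, not from the script) that reproduces the values above:

```python
import math

def cluster_purity(num_matches, num_non_matches):
    """Purity: fraction of the cluster belonging to the majority class."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    """Shannon entropy (base 2) of the match / non-match split."""
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        p = count / total
        if p > 0:
            h -= p * math.log2(p)
    return h

# 9 matches and 59 non-matches, as classified by the oracle above:
print(round(cluster_purity(9, 59), 3))   # 0.868
print(round(cluster_entropy(9, 59), 3))  # 0.564
```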

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing file: diverg(10)149_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 149), dtype: object
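The precision, recall, and f-measure fields in the Series above are consistent with its tp/fp/fn counts. A quick check (the helper name is mine):

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Counts from the row above: tp=43, fp=0, fn=256
p, r, f = prf(43, 0, 256)
print(p, round(r, 6), round(f, 6))  # 1.0 0.143813 0.251462
```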

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)149_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 322
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 322 weight vectors
  Containing 208 true matches and 114 true non-matches
    (64.60% true matches)
  Identified 289 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   272  (94.12%)
          2 :    14  (4.84%)
          3 :     2  (0.69%)
         16 :     1  (0.35%)

Identified 1 non-pure unique weight vector (from 289 unique weight vectors)
Pureness (as the fraction of matches) for a given unique weight vector:
  Pureness : Count
     1.000 : 177
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 111

Removed 1 non-pure weight vector
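A unique weight vector that occurs with both labels is non-pure, and the minority-class copies are dropped. The 0.938 entry above corresponds to a vector seen 16 times, 15 as a match and once as a non-match (15/16 ≈ 0.938). A hedged sketch of that filtering step (all names are mine):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors):
    """weight_vectors: list of (vector_tuple, is_match) pairs.
    Drop the minority-class copies of every non-pure unique vector."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)
    kept, removed = [], 0
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)  # fraction of match labels
        if 0.0 < pureness < 1.0:              # non-pure: keep majority only
            majority = pureness >= 0.5
            for label in labels:
                if label == majority:
                    kept.append((vec, label))
                else:
                    removed += 1
        else:                                 # pure: keep all copies
            kept.extend((vec, label) for label in labels)
    return kept, removed

# One vector observed 16 times: 15 matches, 1 non-match (pureness 0.938)
data = [((0.9, 0.8), True)] * 15 + [((0.9, 0.8), False)]
kept, removed = remove_minority_copies(data)
print(len(kept), removed)  # 15 1
```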

Final number of weight vectors to use: 321
  Number of unique weight vectors: 289

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (289, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 289 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 72
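The per-cluster sample sizes in this log (72 of 289 here, 87 of 916 and 84 of 708 further below) are consistent with Cochran's sample-size formula with a finite-population correction at roughly 95% confidence and 10% sampling error. This is my reading of the output, not necessarily the script's exact code; z and error are assumed values:

```python
def cochran_sample_size(population, match_prop=0.5, z=1.96, error=0.1):
    """Cochran's formula with finite-population correction (assumed)."""
    p = match_prop
    n0 = z * z * p * (1.0 - p) / (error * error)   # infinite-population size
    return int(n0 / (1.0 + (n0 - 1.0) / population))

print(cochran_sample_size(289))  # 72
print(cochran_sample_size(916))  # 87
```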

Perform initial selection using "far" method

Farthest first selection of 72 weight vectors from 289 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
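The "far" initial selection is a farthest-first traversal: it repeatedly picks the vector with the greatest distance to its nearest already-selected vector, so the sample spreads out over the weight-vector space. A minimal sketch, assuming Euclidean distance and the first vector as the seed (the actual script may seed differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector, then
    repeatedly add the vector farthest from the selected set."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # pick the candidate whose nearest selected vector is farthest away
        far = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(far)
        remaining.remove(far)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```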

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 34 matches and 38 non-matches
    Purity of oracle classification:  0.528
    Entropy of oracle classification: 0.998
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 217 weight vectors
  Based on 34 matches and 38 non-matches
  Classified 151 matches and 66 non-matches
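Here an SVM trained on the 34 oracle-labelled matches and 38 non-matches splits the remaining 217 vectors into two child clusters. The script's own classifier is not shown in this output; as a stand-in, here is a minimal soft-margin linear SVM trained by full-batch sub-gradient descent on the hinge loss (the toy data and all names are mine):

```python
def train_linear_svm(X, y, lam=0.01, eta=0.1, iters=2000):
    """Soft-margin linear SVM: full-batch sub-gradient descent on hinge loss.
    X: list of feature tuples, y: labels in {-1, +1}."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        gw = [lam * wj for wj in w]   # gradient of the L2 regulariser
        gb = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:        # margin violation: hinge-loss term
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - eta * gj for wj, gj in zip(w, gw)]
        b -= eta * gb
    return w, b

def classify(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Oracle-labelled sample: high similarities -> match (+1), low -> non-match (-1)
train_X = [(0.9, 0.8), (1.0, 0.9), (0.8, 1.0), (0.1, 0.2), (0.0, 0.1), (0.2, 0.0)]
train_y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(train_X, train_y)
# Split the unlabelled remainder of the cluster into two child clusters:
print(classify(w, b, (0.95, 0.85)), classify(w, b, (0.05, 0.15)))
```

In practice a library classifier such as scikit-learn's `svm.SVC` would be used instead of this hand-rolled version; the point is only that the oracle-labelled sample becomes the training set for splitting the unlabelled rest.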

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 72
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.5277777777777778, 0.9977724720899821, 0.4722222222222222)
    (66, 0.5277777777777778, 0.9977724720899821, 0.4722222222222222)

Current size of match and non-match training data sets: 34 / 38

Selected cluster (queue ordering: random) with:
- Purity 0.53 and entropy 1.00
- Size 151 weight vectors
- Estimated match proportion 0.472

Sample size for this cluster: 59

Farthest first selection of 59 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 59 weight vectors
  The oracle will correctly classify 59 weight vectors and wrongly classify 0
  Classified 52 matches and 7 non-matches
    Purity of oracle classification:  0.881
    Entropy of oracle classification: 0.525
    Number of true matches:      52
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 59 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analyzing file: diverg(20)977_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 977), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)977_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as the fraction of matches) for a given unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(10)184_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (10, 1 - acm diverg, 184), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)184_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 745
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 745 weight vectors
  Containing 169 true matches and 576 true non-matches
    (22.68% true matches)
  Identified 708 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   677  (95.62%)
          2 :    28  (3.95%)
          3 :     2  (0.28%)
          6 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 708 unique weight vectors)
Pureness (as the fraction of matches) for a given unique weight vector:
  Pureness : Count
     1.000 : 152
     0.000 : 556

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 745
  Number of unique weight vectors: 708

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (708, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 708 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 708 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
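
The list above is produced by a farthest-first traversal over the cluster's weight vectors. The selection routine itself does not appear in this output; a minimal sketch of the standard greedy farthest-first algorithm (assuming Euclidean distance and an arbitrary starting vector — both assumptions, since the script's actual choices are not shown here) could look like:

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: start from an arbitrary vector and
    # repeatedly add the vector whose minimum distance to the already
    # selected set is largest (the classic 2-approximation for k-center).
    selected = [vectors[0]]
    # min_dist[i] = distance from vectors[i] to its closest selected vector
    min_dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected
```

Each pick maximises the distance to the nearest already-selected vector, which is why the sampled vectors above spread over both very high and very low similarity regions.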

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 26 matches and 58 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.893
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
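
The purity and entropy reported for the oracle's split follow the usual definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch, consistent with the 0.690 / 0.893 figures printed above for 26 matches and 58 non-matches:

```python
import math

def purity_entropy(num_match, num_non_match):
    # Purity: fraction of the majority class in the sample.
    # Entropy: binary Shannon entropy of the match/non-match split, in bits.
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

A perfectly pure cluster has purity 1.0 and entropy 0.0, which is exactly what the stopping criterion looks for.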

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 624 weight vectors
  Based on 26 matches and 58 non-matches
  Classified 120 matches and 504 non-matches
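
After the oracle labels a sample, the remaining vectors of an impure cluster are split into two child clusters by a classifier trained on those labels (an SVM in this run). The SVM code itself is not part of this output; as a dependency-free stand-in, this sketch splits with a nearest-centroid rule, which illustrates the same split step without assuming scikit-learn is available:

```python
def split_cluster(train_matches, train_non_matches, unlabelled):
    # Stand-in for the script's SVM split: assign each unlabelled weight
    # vector to whichever training-class centroid it is closer to.
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    def dist2(a, b):  # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_m = centroid(train_matches)
    c_n = centroid(train_non_matches)
    pred_match = [v for v in unlabelled if dist2(v, c_m) < dist2(v, c_n)]
    pred_non_match = [v for v in unlabelled if dist2(v, c_m) >= dist2(v, c_n)]
    return pred_match, pred_non_match
```

The two predicted sub-clusters (here 120 and 504 vectors) are then pushed back onto the queue with the purity and match-proportion estimates obtained from the oracle sample.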

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (120, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)
    (504, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)

Current size of match and non-match training data sets: 26 / 58

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 504 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 504 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 3 matches and 67 non-matches
    Purity of oracle classification:  0.957
    Entropy of oracle classification: 0.255
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing file: diverg(10)745_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (10, 1 - acm diverg, 745), dtype: object
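
The per-file summary above reports precision, recall, and f-measure alongside the raw counts, and the printed values are consistent with the standard definitions (tp=78, fp=1, fn=221 gives 0.987342 / 0.26087 / 0.412698). A minimal check:

```python
def prf(tp, fp, fn):
    # Standard precision/recall/F1 from raw confusion counts.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # 2*tp / (2*tp + fp + fn) is algebraically equal to 2PR / (P + R)
    f_measure = 2 * tp / (2 * tp + fp + fn)
    return precision, recall, f_measure
```

Note that tn (here ~4.77e+07, the full cross-product of non-matching record pairs) does not enter any of the three measures, which is why entity resolution is evaluated this way.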

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)745_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 666
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 666 weight vectors
  Containing 181 true matches and 485 true non-matches
    (27.18% true matches)
  Identified 645 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   635  (98.45%)
          2 :     7  (1.09%)
          3 :     2  (0.31%)
         11 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 645 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 160
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 484

Removed 1 non-pure weight vector
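
The occurrence distribution and the pureness values above can be reproduced with a couple of `Counter`s over the (vector, label) pairs: pureness is the fraction of a unique vector's occurrences that are true matches, and vectors with pureness strictly between 0 and 1 carry conflicting labels, so their minority-class copies are removed. A sketch, assuming weight vectors are represented as hashable tuples:

```python
from collections import Counter

def analyse_vectors(weight_vectors, labels):
    # occ[wv]         = how often the unique vector wv occurs
    # match_count[wv] = how many of those occurrences are true matches
    wvs = [tuple(wv) for wv in weight_vectors]
    occ = Counter(wvs)
    match_count = Counter(wv for wv, lab in zip(wvs, labels) if lab)
    # freq_dist: occurrence count -> number of unique vectors with that count
    freq_dist = Counter(occ.values())
    # pureness: fraction of each unique vector's occurrences that are matches
    pureness = {wv: match_count[wv] / n for wv, n in occ.items()}
    return freq_dist, pureness
```

In the run above, one unique vector with pureness 0.909 had its minority (non-match) copy dropped, reducing 666 vectors to 665.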

Final number of weight vectors to use: 665
  Number of unique weight vectors: 645

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (645, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 645 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 645 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 31 matches and 52 non-matches
    Purity of oracle classification:  0.627
    Entropy of oracle classification: 0.953
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 562 weight vectors
  Based on 31 matches and 52 non-matches
  Classified 295 matches and 267 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (295, 0.6265060240963856, 0.9533171305598173, 0.37349397590361444)
    (267, 0.6265060240963856, 0.9533171305598173, 0.37349397590361444)

Current size of match and non-match training data sets: 31 / 52

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 267 weight vectors
- Estimated match proportion 0.373

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 267 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.462, 0.667, 0.600, 0.389, 0.615] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.818, 0.762, 0.714, 0.500, 0.400] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.500, 0.739, 0.824, 0.591, 0.550] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.815, 0.643, 0.800, 0.750, 0.429] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 0 matches and 67 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(15)258_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 258), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)258_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 635
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 635 weight vectors
  Containing 212 true matches and 423 true non-matches
    (33.39% true matches)
  Identified 583 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   547  (93.83%)
          2 :    33  (5.66%)
          3 :     2  (0.34%)
         16 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 583 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 402

Removed 1 non-pure weight vector

Final number of weight vectors to use: 634
  Number of unique weight vectors: 583

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (583, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 583 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 583 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 501 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 151 matches and 350 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (350, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 350 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 350 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.714, 0.727, 0.750, 0.294, 0.833] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.348, 0.429, 0.526, 0.529, 0.619] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)

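The "farthest first" listings above come from a greedy farthest-first traversal: each step selects the weight vector whose minimum distance to the already-selected set is largest. A sketch assuming squared Euclidean distance and an arbitrary starting vector (both are assumptions; the program's metric and seeding may differ):

```python
def farthest_first(vectors, k):
    """Greedily pick k vectors, each maximising the minimum squared
    Euclidean distance to the vectors selected so far."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]                    # arbitrary start (assumption)
    min_d = [dist2(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        min_d = [min(d, dist2(v, vectors[i])) for v, d in zip(vectors, min_d)]
    return selected

# e.g. farthest_first([(0, 0), (1, 0), (5, 0), (10, 0)], 2)
# starts at (0, 0) and then picks the farthest point, (10, 0)
```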
Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 7 matches and 61 non-matches
    Purity of oracle classification:  0.897
    Entropy of oracle classification: 0.478
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(10)284_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 284), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)284_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 640
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 640 weight vectors
  Containing 177 true matches and 463 true non-matches
    (27.66% true matches)
  Identified 601 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   571  (95.01%)
          2 :    27  (4.49%)
          3 :     2  (0.33%)
          9 :     1  (0.17%)
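The occurrence distribution above can be reproduced by counting duplicates among the weight vectors, for example with collections.Counter (a sketch; the function name is mine):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that occur exactly that often."""
    occ = Counter(tuple(wv) for wv in weight_vectors)   # vector -> count
    return Counter(occ.values()), len(occ)              # distribution, unique

# e.g. for vectors a, a, a, b, c this gives ({3: 1, 1: 2}, 3):
# one vector occurs 3 times, two vectors occur once, 3 unique in total
```

For the 640 vectors above: 571 occur once, 27 twice, 2 three times, and 1 nine times, giving 601 unique vectors.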

Identified 1 non-pure unique weight vector (from 601 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 158
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 442

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 631
  Number of unique weight vectors: 600
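The removal of non-pure weight vectors can be sketched as follows: group identical vectors, compute each group's pureness (fraction of true matches among its occurrences), and drop occurrences of any group that is neither purely matching nor purely non-matching. This sketch follows the "remove all occurrences" variant used in this run (9 occurrences of the 0.889-pure vector removed); the later run in this log instead removes only the minority-class occurrences.

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, match_flags):
    """Drop all occurrences of weight vectors whose occurrences mix true
    matches and true non-matches (pureness strictly between 0 and 1)."""
    groups = defaultdict(list)
    for wv, is_match in zip(weight_vectors, match_flags):
        groups[tuple(wv)].append(is_match)

    kept_vecs, kept_flags = [], []
    for wv, is_match in zip(weight_vectors, match_flags):
        flags = groups[tuple(wv)]
        pureness = sum(flags) / len(flags)     # fraction of true matches
        if pureness in (0.0, 1.0):
            kept_vecs.append(wv)
            kept_flags.append(is_match)
    return kept_vecs, kept_flags
```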

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (600, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 600 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 600 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 517 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 140 matches and 377 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (377, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 377 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 377 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.750, 0.524, 0.400, 0.813, 0.611] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.600, 0.857, 0.579, 0.286, 0.545] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.417, 0.750, 0.500, 0.455] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.857, 0.444, 0.556, 0.235, 0.500] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.714, 0.318, 0.583, 0.417, 0.727] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 1 match and 69 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.108
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing the file: diverg(20)101_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 101), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)101_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1073
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1073 weight vectors
  Containing 226 true matches and 847 true non-matches
    (21.06% true matches)
  Identified 1016 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   979  (96.36%)
          2 :    34  (3.35%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1016 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 826

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1072
  Number of unique weight vectors: 1016

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1016, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1016 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1016 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 929 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 332 matches and 597 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (332, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (597, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 332 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 332 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 40 matches and 30 non-matches
    Purity of oracle classification:  0.571
    Entropy of oracle classification: 0.985
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  30
    Number of false non-matches: 0
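
The purity and entropy figures reported above are consistent with majority-class purity and the binary Shannon entropy of the match/non-match split. A minimal sketch (the function name is mine, not the script's):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity: fraction of weight vectors in the majority class.
    # Entropy: binary Shannon entropy of the match proportion, in bits.
    n = num_matches + num_non_matches
    p = num_matches / n
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the 40 matches / 30 non-matches above this yields purity ~0.571 and entropy ~0.985, matching the log.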

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)238_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 238), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)238_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 830
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 830 weight vectors
  Containing 213 true matches and 617 true non-matches
    (25.66% true matches)
  Identified 776 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   741  (95.49%)
          2 :    32  (4.12%)
          3 :     2  (0.26%)
         19 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 776 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 596

Removed 1 non-pure weight vector

Final number of weight vectors to use: 829
  Number of unique weight vectors: 776
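
The pureness table above can be reproduced by grouping identical weight vectors and computing the match fraction per group; a rough sketch under that assumption (the helper name is hypothetical, not the script's API):

```python
from collections import Counter

def analyse_pureness(weight_vectors, labels):
    # For each unique weight vector, compute the fraction of its
    # occurrences that correspond to true matches (its "pureness").
    counts = Counter()   # total occurrences per unique vector
    matches = Counter()  # match occurrences per unique vector
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)
        counts[key] += 1
        if is_match:
            matches[key] += 1
    return {key: matches[key] / counts[key] for key in counts}
```

Vectors with pureness strictly between 0 and 1 are the non-pure ones that the script then removes, as logged above.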

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (776, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 776 weight vectors
- Estimated match proportion 0.500
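
The loop reported here follows a simple pattern: pop a cluster from the queue (ordering: random), sample and oracle-label some of its weight vectors, and push any split remainders back, until the manual-classification budget is spent. A rough sketch of that control flow, with all callables as assumed placeholders rather than the script's actual functions:

```python
import random

def train_selection_loop(clusters, budget, sample, classify_oracle, split):
    # Process a queue of clusters until it is empty or the budget runs out.
    used = 0  # number of manual oracle classifications performed so far
    while clusters and used < budget:
        cluster = clusters.pop(random.randrange(len(clusters)))
        vectors = sample(cluster)           # e.g. farthest-first selection
        labels = classify_oracle(vectors)   # manual classification
        used += len(vectors)
        clusters.extend(split(cluster, vectors, labels))  # re-queue remainders
    return used
```

The budget check mirrors the "Reached end of manual classification budget" message in the log.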

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 776 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
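
The "farthest first" selection logged above is the classic greedy traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A self-contained sketch (Euclidean distance and the starting index are assumptions):

```python
import math

def farthest_first(vectors, k, start=0):
    # Greedy farthest-first traversal over a list of numeric vectors.
    selected = [start]
    # min_dist[i] = distance from vector i to its nearest selected vector.
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            d = math.dist(v, vectors[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected
```

This favours vectors spread across the whole weight-vector space, which is why the sample above mixes clear matches, clear non-matches, and borderline cases.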

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 691 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 149 matches and 542 non-matches
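
The SVM step trains on the oracle-labelled sample and partitions the cluster's remaining weight vectors into predicted-match and predicted-non-match sub-clusters. A minimal sketch assuming scikit-learn (the kernel choice and function name are mine, not necessarily the script's):

```python
import numpy as np
from sklearn.svm import SVC

def split_cluster(train_X, train_y, rest_X):
    # Train an SVM on the oracle-labelled weight vectors, then split the
    # remaining (unlabelled) vectors by predicted class.
    clf = SVC(kernel="linear")  # assumed kernel
    clf.fit(train_X, train_y)
    pred = clf.predict(rest_X)
    return rest_X[pred == 1], rest_X[pred == 0]
```

The two returned arrays correspond to the two sub-clusters placed back on the queue (149 and 542 vectors in this run).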

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (542, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 542 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 542 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 4 matches and 69 non-matches
    Purity of oracle classification:  0.945
    Entropy of oracle classification: 0.306
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)437_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976923
recall                 0.424749
f-measure              0.592075
da                          130
dm                            0
ndm                           0
tp                          127
fp                            3
tn                  4.76529e+07
fn                          172
Name: (10, 1 - acm diverg, 437), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)437_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 662
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 662 weight vectors
  Containing 137 true matches and 525 true non-matches
    (20.69% true matches)
  Identified 646 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   635  (98.30%)
          2 :     8  (1.24%)
          3 :     2  (0.31%)
          5 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 646 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 123
     0.000 : 523

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 662
  Number of unique weight vectors: 646

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (646, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 646 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 646 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 31 matches and 52 non-matches
    Purity of oracle classification:  0.627
    Entropy of oracle classification: 0.953
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 563 weight vectors
  Based on 31 matches and 52 non-matches
  Classified 84 matches and 479 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (84, 0.6265060240963856, 0.9533171305598173, 0.37349397590361444)
    (479, 0.6265060240963856, 0.9533171305598173, 0.37349397590361444)

Current size of match and non-match training data sets: 31 / 52

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 84 weight vectors
- Estimated match proportion 0.373

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 84 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 40 matches and 4 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

130.0
Analysing file: diverg(10)929_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 929), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)929_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 885
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 885 weight vectors
  Containing 177 true matches and 708 true non-matches
    (20.00% true matches)
  Identified 846 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   816  (96.45%)
          2 :    27  (3.19%)
          3 :     2  (0.24%)
          9 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 846 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 158
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 687

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 876
  Number of unique weight vectors: 845

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (845, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 845 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 845 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
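
The purity and entropy figures above follow directly from the match / non-match counts: a minimal sketch, assuming purity is the majority-class fraction and entropy the binary Shannon entropy in bits (which reproduces the 0.674 / 0.910 values printed for 28 matches and 58 non-matches):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary Shannon entropy
    (in bits) of a two-class classification result."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# 28 matches and 58 non-matches, as in the oracle output above
print(purity_entropy(28, 58))  # -> (0.674..., 0.910...)
```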

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 759 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 140 matches and 619 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (619, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 619 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 619 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
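
Farthest-first selection, as used above, greedily picks the weight vector whose minimum distance to the already selected set is largest, yielding a diverse sample from the cluster. A minimal sketch (Euclidean distance and the choice of starting vector are assumptions):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: start from one weight vector and
    repeatedly add the vector that maximises the minimum Euclidean
    distance to everything selected so far, until k are selected."""
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    selected = [vectors[start]]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        # update each vector's distance to the selected set
        min_dist = [min(d, math.dist(v, vectors[i]))
                    for d, v in zip(min_dist, vectors)]
    return selected
```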

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 1 match and 73 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.103
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing file: diverg(15)351_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 351), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)351_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 848
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 848 weight vectors
  Containing 214 true matches and 634 true non-matches
    (25.24% true matches)
  Identified 794 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   759  (95.59%)
          2 :    32  (4.03%)
          3 :     2  (0.25%)
         19 :     1  (0.13%)
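
The frequency distribution above counts how often each distinct weight vector occurs. A minimal sketch using hashable tuples (the `occurrence_distribution` helper name is an assumption):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map 'occurrence count' -> 'number of unique weight vectors that
    occur exactly that often', as in the distribution printed above."""
    counts = Counter(map(tuple, weight_vectors))
    return dict(sorted(Counter(counts.values()).items()))

vecs = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3],
        [0.9, 0.9], [0.9, 0.9], [0.9, 0.9]]
print(occurrence_distribution(vecs))  # -> {1: 1, 2: 1, 3: 1}
```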

Identified 1 non-pure unique weight vector (from 794 unique weight vectors)
Pureness (proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 613

Removed 1 non-pure weight vector
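
A non-pure unique weight vector is one generated by both true matching and true non-matching record pairs; it is resolved by dropping its minority-class copies. A minimal sketch (the tie-breaking rule, keeping ties as matches, is an assumption):

```python
from collections import defaultdict

def remove_minority_class(labelled_vectors):
    """Keep, for each unique weight vector, only the copies whose match
    status agrees with that vector's majority class.
    labelled_vectors: iterable of (weight_vector_tuple, is_match)."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        majority = 2 * sum(labels) >= len(labels)  # tie -> match
        kept.extend((vec, lab) for lab in labels if lab == majority)
    return kept
```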

Final number of weight vectors to use: 847
  Number of unique weight vectors: 794

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (794, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 794 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 794 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.722, 0.471, 0.545, 0.579] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.538, 0.500, 0.818, 0.789, 0.750] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.300, 0.524, 0.727, 0.762] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 709 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 123 matches and 586 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (586, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 586 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 586 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [0.917, 0.000, 0.550, 0.455, 0.455, 0.000, 0.000] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.917, 0.818, 0.714, 0.611] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.833, 0.833, 0.550, 0.500, 0.688] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.783, 0.357, 0.750, 0.412, 0.238] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.636, 0.545, 0.368, 0.563, 0.462] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 17 matches and 56 non-matches
    Purity of oracle classification:  0.767
    Entropy of oracle classification: 0.783
    Number of true matches:      17
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(20)276_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 276), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)276_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

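The split step trains a binary classifier on the oracle-labelled vectors and uses it to partition the remaining vectors of the cluster into two child clusters (predicted matches and predicted non-matches). A minimal sketch using scikit-learn — an assumption, as the original program may use a different SVM implementation, kernel, or parameters:

```python
from sklearn import svm

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    # Train an SVM on the oracle-classified weight vectors, then split
    # the remaining vectors into predicted matches / non-matches.
    clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(labelled_vecs, labels)
    pred = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 0]
    return matches, non_matches

# Toy example: high similarities labelled as matches (1), low as non-matches (0).
train = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]]
y = [1, 1, 0, 0]
m, n = svm_split(train, y, [[0.95, 0.85], [0.15, 0.15]])
print(len(m), len(n))  # 1 1
```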
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

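Farthest-first selection, as used above, is a greedy traversal over the weight vectors: start from one vector and repeatedly add the vector whose minimum distance to the already-selected set is largest, so the sample spreads out over the cluster. A minimal sketch assuming Euclidean distance and a deterministic start at index 0 (the actual program may differ in both choices):

```python
import numpy as np

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over a set of weight vectors.
    # Assumes Euclidean distance and starts from the first vector;
    # the original program may use a different start or metric.
    vecs = np.asarray(vectors, dtype=float)
    selected = [0]
    min_dist = np.linalg.norm(vecs - vecs[0], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))  # farthest from all selected so far
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(vecs - vecs[nxt], axis=1))
    return selected

# Two extreme corners plus a middle point: the corners are picked first.
print(farthest_first([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]], 2))  # [0, 2]
```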
Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)490_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 490), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)490_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 401
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 401 weight vectors
  Containing 217 true matches and 184 true non-matches
    (54.11% true matches)
  Identified 368 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   352  (95.65%)
          2 :    13  (3.53%)
          3 :     2  (0.54%)
         17 :     1  (0.27%)

Identified 1 non-pure unique weight vector (from 368 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 183

Removed 1 non-pure weight vector

Final number of weight vectors to use: 400
  Number of unique weight vectors: 368

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (368, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 368 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 368 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 41 matches and 35 non-matches
    Purity of oracle classification:  0.539
    Entropy of oracle classification: 0.995
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  35
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 292 weight vectors
  Based on 41 matches and 35 non-matches
  Classified 148 matches and 144 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.5394736842105263, 0.9954993847275952, 0.5394736842105263)
    (144, 0.5394736842105263, 0.9954993847275952, 0.5394736842105263)

Current size of match and non-match training data sets: 41 / 35

Selected cluster (queue ordering: random) with:
- Purity 0.54 and entropy 1.00
- Size 148 weight vectors
- Estimated match proportion 0.539

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 46 matches and 12 non-matches
    Purity of oracle classification:  0.793
    Entropy of oracle classification: 0.736
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  12
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)832_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 832), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)832_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 407
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 407 weight vectors
  Containing 217 true matches and 190 true non-matches
    (53.32% true matches)
  Identified 370 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   352  (95.14%)
          2 :    15  (4.05%)
          3 :     2  (0.54%)
         19 :     1  (0.27%)

Identified 1 non-pure unique weight vector (from 370 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 187

Removed 1 non-pure weight vector

Final number of weight vectors to use: 406
  Number of unique weight vectors: 370

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (370, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 370 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 370 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 29 matches and 47 non-matches
    Purity of oracle classification:  0.618
    Entropy of oracle classification: 0.959
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 294 weight vectors
  Based on 29 matches and 47 non-matches
  Classified 145 matches and 149 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.618421052631579, 0.959149554396894, 0.3815789473684211)
    (149, 0.618421052631579, 0.959149554396894, 0.3815789473684211)

Current size of match and non-match training data sets: 29 / 47

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 149 weight vectors
- Estimated match proportion 0.382

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 9 matches and 48 non-matches
    Purity of oracle classification:  0.842
    Entropy of oracle classification: 0.629
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(15)981_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 981), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)981_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 346
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 346 weight vectors
  Containing 212 true matches and 134 true non-matches
    (61.27% true matches)
  Identified 312 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   294  (94.23%)
          2 :    15  (4.81%)
          3 :     2  (0.64%)
         16 :     1  (0.32%)

Identified 1 non-pure unique weight vector (from 312 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 131

Removed 1 non-pure weight vector

Final number of weight vectors to use: 345
  Number of unique weight vectors: 312

Time to load and analyse the weight vector file: 0.00 sec
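The non-pure removal step above (a vector occurring 16 times with 15 matches and 1 non-match has pureness 15/16 = 0.938, and the single minority-class copy is removed) can be sketched as follows. This is our reading of the log, not the original implementation; the function name and data layout are assumptions:

```python
from collections import Counter, defaultdict

def remove_non_pure(vectors):
    """Drop minority-class copies of weight vectors that occur with
    both match and non-match labels (pureness strictly between 0 and 1).

    `vectors` is a list of (tuple_of_weights, is_match) pairs.
    """
    counts = defaultdict(Counter)
    for w, is_match in vectors:
        counts[w][is_match] += 1
    kept = []
    for w, is_match in vectors:
        c = counts[w]
        if len(c) == 1:                      # pure vector: keep all copies
            kept.append((w, is_match))
        elif c[is_match] > c[not is_match]:  # majority class: keep
            kept.append((w, is_match))
    return kept

# 16 copies of one vector: 15 matches, 1 non-match (pureness 0.938)
data = [((0.9, 0.8), True)] * 15 + [((0.9, 0.8), False)] + [((0.1, 0.2), False)]
print(len(remove_non_pure(data)))  # 16: only the minority non-match copy is removed
```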

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (312, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 312 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 73

Perform initial selection using "far" method

Farthest first selection of 73 weight vectors from 312 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
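The "far" selections printed above can be sketched as a greedy farthest-first traversal: start from an arbitrary vector, then repeatedly add the vector whose distance to its closest already-selected vector is largest. A minimal sketch assuming Euclidean distance (the original code may seed and measure distance differently):

```python
def farthest_first(vectors, k, dist=None):
    """Greedy farthest-first traversal (a 2-approximation to k-center)."""
    if dist is None:
        dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    selected = [vectors[0]]
    # min_d[j]: distance from vectors[j] to its nearest selected vector
    min_d = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            d = dist(v, vectors[i])
            if d < min_d[j]:
                min_d[j] = d
    return selected

pts = [(0.0,), (0.1,), (0.9,), (1.0,), (0.5,)]
print(farthest_first(pts, 3))  # [(0.0,), (1.0,), (0.5,)]
```

Each newly selected point only requires one pass to refresh the nearest-selected distances, so selecting k of n vectors costs O(kn) distance evaluations.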

Perform oracle with 100.00 accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 35 matches and 38 non-matches
    Purity of oracle classification:  0.521
    Entropy of oracle classification: 0.999
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 239 weight vectors
  Based on 35 matches and 38 non-matches
  Classified 150 matches and 89 non-matches
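The split step above trains an SVM on the oracle-labelled sample (35 matches, 38 non-matches) and uses it to divide the 239 remaining vectors into two child clusters. A sketch of that step using scikit-learn's `SVC` — an assumption, since the original program may use a different SVM implementation; the synthetic training data below is purely illustrative:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Hypothetical stand-ins for the oracle-labelled sample: match weight
# vectors cluster near 1.0, non-matches near 0.0 in every dimension.
train_m = rng.uniform(0.7, 1.0, size=(35, 7))
train_n = rng.uniform(0.0, 0.3, size=(38, 7))
X_train = np.vstack([train_m, train_n])
y_train = np.array([1] * 35 + [0] * 38)

clf = SVC(kernel="linear")
clf.fit(X_train, y_train)

# Split the remaining (unlabelled) vectors of the cluster into
# predicted matches and non-matches, as after each oracle round above.
rest = rng.uniform(0.0, 1.0, size=(239, 7))
pred = clf.predict(rest)
print((pred == 1).sum(), (pred == 0).sum())
```

The two predicted subsets then re-enter the queue as separate clusters; note in Loop 2 that both children initially inherit the purity, entropy, and estimated match proportion computed from the parent's oracle sample.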

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 73
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.5205479452054794, 0.998781393072756, 0.4794520547945205)
    (89, 0.5205479452054794, 0.998781393072756, 0.4794520547945205)

Current size of match and non-match training data sets: 35 / 38

Selected cluster with (queue ordering: random):
- Purity 0.52 and entropy 1.00
- Size 89 weight vectors
- Estimated match proportion 0.479

Sample size for this cluster: 46

Farthest first selection of 46 weight vectors from 89 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)

Perform oracle with 100.00 accuracy on 46 weight vectors
  The oracle will correctly classify 46 weight vectors and wrongly classify 0
  Classified 3 matches and 43 non-matches
    Purity of oracle classification:  0.935
    Entropy of oracle classification: 0.348
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 46 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(10)974_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979592
recall                  0.32107
f-measure              0.483627
da                           98
dm                            0
ndm                           0
tp                           96
fp                            2
tn                  4.76529e+07
fn                          203
Name: (10, 1 - acm diverg, 974), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)974_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 315
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 315 weight vectors
  Containing 159 true matches and 156 true non-matches
    (50.48% true matches)
  Identified 299 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   289  (96.66%)
          2 :     7  (2.34%)
          3 :     2  (0.67%)
          6 :     1  (0.33%)

Identified 0 non-pure unique weight vectors (from 299 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 143
     0.000 : 156

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 315
  Number of unique weight vectors: 299

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (299, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 299 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 73

Perform initial selection using "far" method

Farthest first selection of 73 weight vectors from 299 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00 accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 49 matches and 24 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  24
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 226 weight vectors
  Based on 49 matches and 24 non-matches
  Classified 226 matches and 0 non-matches

98.0
Analysing file: diverg(15)778_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 778), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)778_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.722, 0.471, 0.545, 0.579] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.538, 0.500, 0.818, 0.789, 0.750] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.300, 0.524, 0.727, 0.762] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00 accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 27 matches and 59 non-matches
    Purity of oracle classification:  0.686
    Entropy of oracle classification: 0.898
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 27 matches and 59 non-matches
  Classified 134 matches and 585 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (134, 0.686046511627907, 0.8976844934141643, 0.313953488372093)
    (585, 0.686046511627907, 0.8976844934141643, 0.313953488372093)

Current size of match and non-match training data sets: 27 / 59

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.90
- Size 585 weight vectors
- Estimated match proportion 0.314

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 585 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.800, 0.000, 0.625, 0.571, 0.467, 0.474, 0.667] (False)
    [1.000, 0.000, 0.350, 0.455, 0.625, 0.000, 0.000] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.625, 0.174, 0.333, 0.259, 0.286] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.857, 0.111, 0.444, 0.529, 0.500] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.875, 0.467, 0.471, 0.833, 0.571] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00 accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 15 matches and 57 non-matches
    Purity of oracle classification:  0.792
    Entropy of oracle classification: 0.738
    Number of true matches:      15
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)829_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 829), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)829_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 799
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 799 weight vectors
  Containing 213 true matches and 586 true non-matches
    (26.66% true matches)
  Identified 747 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   712  (95.31%)
          2 :    32  (4.28%)
          3 :     2  (0.27%)
         17 :     1  (0.13%)
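
A frequency distribution like the one above can be reproduced by counting duplicate weight vectors, e.g. with `collections.Counter`. A sketch (`occurrence_distribution` is a hypothetical helper, assuming each weight vector is a tuple of floats):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight vectors
    occurring exactly that often."""
    vec_freq = Counter(map(tuple, weight_vectors))  # vector -> frequency
    return Counter(vec_freq.values())               # frequency -> vector count

vecs = [(0.1, 0.2)] * 17 + [(0.3, 0.4)] * 2 + [(0.5, 0.6)]
print(sorted(occurrence_distribution(vecs).items()))
# [(1, 1), (2, 1), (17, 1)]
```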

Identified 1 non-pure unique weight vector (from 747 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 565

Removed 1 non-pure weight vector
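
A non-pure unique weight vector is one that occurs with conflicting true match labels (e.g. pureness 0.941 = 16 of 17 copies are matches). Two cleaning variants appear in this log: removing only the minority-class copies, or removing all copies of a non-pure vector. This sketch implements the minority-removal variant, with hypothetical names:

```python
from collections import Counter, defaultdict

def remove_minority_copies(weight_vectors, labels):
    """For each unique weight vector with conflicting true match labels,
    keep only the majority-label copies (one possible cleaning policy;
    the program also supports removing all copies of a non-pure vector)."""
    by_vec = defaultdict(list)
    for vec, lab in zip(map(tuple, weight_vectors), labels):
        by_vec[vec].append(lab)
    kept_vecs, kept_labels = [], []
    for vec, labs in by_vec.items():
        if len(set(labs)) == 1:          # pure: keep every copy
            kept_vecs += [vec] * len(labs)
            kept_labels += labs
        else:                            # non-pure: keep majority copies only
            majority, count = Counter(labs).most_common(1)[0]
            kept_vecs += [vec] * count
            kept_labels += [majority] * count
    return kept_vecs, kept_labels
```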

Final number of weight vectors to use: 798
  Number of unique weight vectors: 747

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (747, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 747 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 747 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
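
Farthest-first selection, as used above, greedily picks the weight vector whose minimum distance to the already selected set is largest. A minimal sketch, assuming squared Euclidean distance and seeding from the first vector (the actual program may use a different metric or seeding):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly select the vector whose
    minimum distance to the already selected set is largest."""
    def dist2(a, b):  # squared Euclidean distance suffices for comparison
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]                        # arbitrary seed
    min_d = [dist2(v, vectors[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        idx = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):            # refresh nearest distances
            min_d[i] = min(min_d[i], dist2(v, vectors[idx]))
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

The near-duplicate point (0.1, 0.0) is skipped because it is close to an already selected vector, which is exactly why this method spreads samples across the weight-vector space.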

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
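
The oracle step simulates manual classification with a configurable accuracy: each queried vector's true label is kept with probability equal to the accuracy and flipped otherwise, which is why a 100.00% oracle wrongly classifies 0 vectors. A minimal sketch (hypothetical helper name, uniform random flipping assumed):

```python
import random

def noisy_oracle(true_labels, accuracy, rng=random):
    """Simulate a manual classifier: each true label is returned unchanged
    with probability `accuracy` and flipped otherwise."""
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]

# With accuracy 1.0 (as in this run) no label is ever flipped:
print(noisy_oracle([True, False, True], 1.0))  # [True, False, True]
```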

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 662 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 86 matches and 576 non-matches
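
The SVM split step trains a binary classifier on the oracle-labelled samples and partitions the remaining vectors by its predictions; both child clusters then enter the queue with the parent's purity and entropy estimates, as the Loop 2 listing shows. A sketch using scikit-learn (`sklearn.svm.SVC` with a linear kernel is an assumption; the original program may use a different SVM implementation or kernel):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train a binary SVM on oracle-labelled vectors and split the
    remaining vectors by its predictions (1 = match, 0 = non-match)."""
    clf = SVC(kernel='linear')  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if p == 0]
    return matches, non_matches
```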

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (86, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (576, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 576 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 576 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 20 matches and 53 non-matches
    Purity of oracle classification:  0.726
    Entropy of oracle classification: 0.847
    Number of true matches:      20
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)91_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 91), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)91_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 351
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 351 weight vectors
  Containing 172 true matches and 179 true non-matches
    (49.00% true matches)
  Identified 330 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   318  (96.36%)
          2 :     9  (2.73%)
          3 :     2  (0.61%)
          9 :     1  (0.30%)

Identified 1 non-pure unique weight vector (from 330 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 153
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 176

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 342
  Number of unique weight vectors: 329

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (329, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 329 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 329 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 31 matches and 43 non-matches
    Purity of oracle classification:  0.581
    Entropy of oracle classification: 0.981
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 255 weight vectors
  Based on 31 matches and 43 non-matches
  Classified 125 matches and 130 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 74
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (125, 0.581081081081081, 0.9809470132751208, 0.4189189189189189)
    (130, 0.581081081081081, 0.9809470132751208, 0.4189189189189189)

Current size of match and non-match training data sets: 31 / 43

Selected cluster (queue ordering: random) with:
- Purity 0.58 and entropy 0.98
- Size 130 weight vectors
- Estimated match proportion 0.419

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 130 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [0.367, 1.000, 0.154, 0.174, 0.125, 0.240, 0.226] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.800, 0.636, 0.563, 0.545, 0.722] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 3 matches and 52 non-matches
    Purity of oracle classification:  0.945
    Entropy of oracle classification: 0.305
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing file: diverg(15)617_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 617), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)617_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 952
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 952 weight vectors
  Containing 201 true matches and 751 true non-matches
    (21.11% true matches)
  Identified 907 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   873  (96.25%)
          2 :    31  (3.42%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 907 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 730

Removed 1 non-pure weight vector

Final number of weight vectors to use: 951
  Number of unique weight vectors: 907

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (907, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 907 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 907 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
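
The purity and entropy figures the oracle step reports can be reproduced from the match / non-match counts alone. A minimal sketch — two-class purity as the majority-class fraction and entropy in bits, formulas inferred from the logged values, so treat them as assumptions about the program's internals:

```python
import math

def purity(num_match, num_nonmatch):
    """Majority-class fraction of a two-class cluster."""
    total = num_match + num_nonmatch
    return max(num_match, num_nonmatch) / total

def entropy(num_match, num_nonmatch):
    """Two-class Shannon entropy in bits (0 for a pure cluster)."""
    total = num_match + num_nonmatch
    h = 0.0
    for count in (num_match, num_nonmatch):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h
```

For the 24 / 63 split above this gives purity 0.724 and entropy 0.850, agreeing with the (size, purity, entropy, match proportion) cluster tuples printed in the queue summaries.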

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 820 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 115 matches and 705 non-matches
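
The SVM step above splits the remaining (oracle-unseen) cluster by training on the oracle-labelled vectors and re-labelling the rest; the predicted matches and predicted non-matches become the two new clusters pushed onto the queue. A hedged sketch — scikit-learn's SVC with a linear kernel is an assumption, as the original program may use a different SVM binding and parameters:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-classified vectors (label 1 = match,
    0 = non-match) and split the cluster by its predictions."""
    clf = SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches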

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (115, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (705, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 115 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 46

Farthest first selection of 46 weight vectors from 115 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
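
The "Farthest first selection" listings come from a greedy farthest-first traversal: each pick maximises its minimum distance to the vectors already chosen, which favours spread-out examples. A minimal sketch, assuming Euclidean distance and centroid-based seeding (the program's actual metric and seeding rule are not shown in the log):

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    """Greedily pick k vectors, each maximising its minimum
    distance to the vectors selected so far; the first pick is
    the vector farthest from the data-set centroid."""
    centroid = [sum(col) / len(vectors) for col in zip(*vectors)]
    selected = [max(vectors, key=lambda v: euclidean(v, centroid))]
    while len(selected) < k:
        nxt = max((v for v in vectors if v not in selected),
                  key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(nxt)
    return selected
```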

Perform oracle with 100.00% accuracy on 46 weight vectors
  The oracle will correctly classify 46 weight vectors and wrongly classify 0
  Classified 46 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 46 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)553_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 553), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)553_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1059
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1059 weight vectors
  Containing 219 true matches and 840 true non-matches
    (20.68% true matches)
  Identified 1003 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   967  (96.41%)
          2 :    33  (3.29%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
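
The occurrence table above (how often each distinct weight vector occurs) can be tabulated with two nested counts. A sketch, assuming weight vectors are sequences of floats that can be hashed as tuples:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of distinct vectors that
    occur that often (the table printed in the analysis step)."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())
```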

Identified 1 non-pure unique weight vector (from 1003 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 183
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 819

Removed 1 non-pure weight vector
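
The pureness step treats a unique weight vector as non-pure when the record pairs behind it disagree on their true match status, and removes the minority-class copies (one copy in the run above: the vector occurring 20 times with pureness 0.950). A sketch under that assumption:

```python
def remove_minority_copies(status_lists):
    """Given, per unique weight vector, the list of true match
    statuses of the pairs that produced it, drop the minority-class
    copies and report (kept, removed) totals."""
    kept = 0
    removed = 0
    for statuses in status_lists:
        n_match = sum(statuses)
        n_nonmatch = len(statuses) - n_match
        minority = min(n_match, n_nonmatch)
        removed += minority
        kept += len(statuses) - minority
    return kept, removed
```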

Final number of weight vectors to use: 1058
  Number of unique weight vectors: 1003

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1003, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1003 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1003 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 916 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 319 matches and 597 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (319, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (597, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 597 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 597 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.923, 0.667, 0.667, 0.412, 0.571] (False)
    [0.667, 0.000, 0.667, 0.500, 0.647, 0.556, 0.684] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.750, 0.429, 0.526, 0.500, 0.846] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.750, 0.524, 0.400, 0.813, 0.611] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 0.583, 0.444, 0.412, 0.318, 0.421] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.233, 0.545, 0.714, 0.455, 0.238] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.462, 0.889, 0.455, 0.211, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(15)861_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (15, 1 - acm diverg, 861), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)861_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 901
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 901 weight vectors
  Containing 208 true matches and 693 true non-matches
    (23.09% true matches)
  Identified 848 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   813  (95.87%)
          2 :    32  (3.77%)
          3 :     2  (0.24%)
         18 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 848 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 175
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 672

Removed 1 non-pure weight vector

Final number of weight vectors to use: 900
  Number of unique weight vectors: 848

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (848, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 848 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 848 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 30 matches and 56 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.933
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 762 weight vectors
  Based on 30 matches and 56 non-matches
  Classified 192 matches and 570 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (192, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)
    (570, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)

Current size of match and non-match training data sets: 30 / 56

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 570 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 570 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.667, 0.444, 0.556, 0.222, 0.143] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
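The farthest-first listings in this log can be reproduced with a greedy farthest-first traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and hypothetical inputs `vectors` (a list of tuples) and `k`:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection: start from the first vector,
    then repeatedly add the vector whose minimum distance to the
    already-selected set is largest."""
    selected = [vectors[0]]
    while len(selected) < k:
        best, best_dist = None, -1.0
        for v in vectors:
            if v in selected:
                continue
            # Distance to the closest already-selected vector
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected
```

Each newly selected vector maximises its minimum distance to all previously selected ones, which is why the printed vectors spread out across the corners of the weight space.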

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 0 matches and 76 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analyzing file: diverg(15)188_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 188), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)188_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 901
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 901 weight vectors
  Containing 213 true matches and 688 true non-matches
    (23.64% true matches)
  Identified 849 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   813  (95.76%)
          2 :    33  (3.89%)
          3 :     2  (0.24%)
         16 :     1  (0.12%)
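The occurrence distribution above (how many unique vectors appear once, twice, and so on) can be computed with two nested `collections.Counter` passes; `vectors` is a hypothetical list of weight-vector tuples:

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Count how often each distinct weight vector occurs, then count
    how many unique vectors occur with each frequency."""
    vec_counts = Counter(tuple(v) for v in vectors)  # vector -> occurrences
    freq_dist = Counter(vec_counts.values())         # occurrences -> count
    total_unique = len(vec_counts)
    for occ in sorted(freq_dist):
        n = freq_dist[occ]
        print('%10d : %5d  (%.2f%%)' % (occ, n, 100.0 * n / total_unique))
    return freq_dist
```

The percentages are taken over the number of unique vectors, matching the log (e.g. 813 of 849 unique vectors occurring once gives 95.76%).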

Identified 1 non-pure unique weight vector (from 849 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 667

Removed 1 non-pure weight vector
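Removing the minority-class copies of a non-pure unique weight vector (as in the 0.938-pureness entry above, presumably 1 minority-labelled copy among 16 duplicates) can be sketched as follows; `vec_labels` is a hypothetical list of (vector, label) pairs:

```python
from collections import defaultdict

def remove_minority(vec_labels):
    """For each unique weight vector, keep only the copies whose label
    belongs to the majority class (drops minority-class duplicates)."""
    groups = defaultdict(list)
    for vec, label in vec_labels:
        groups[tuple(vec)].append(label)
    kept = []
    for vec, labels in groups.items():
        # Majority label among the duplicates of this unique vector
        majority = max(set(labels), key=labels.count)
        kept.extend((vec, majority) for _ in range(labels.count(majority)))
    return kept
```

After this step every remaining unique weight vector is pure, i.e. all its copies share one true match status.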

Final number of weight vectors to use: 900
  Number of unique weight vectors: 849

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (849, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 849 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 849 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
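The purity and entropy values reported after each oracle call (e.g. 0.663 and 0.922 above for 29 matches and 57 non-matches) are consistent with majority-class purity and binary Shannon entropy in base 2; a sketch of the assumed computation:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity is the fraction of the majority class; entropy is the
    binary Shannon entropy (base 2) of the class distribution."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy
```

With 29 matches and 57 non-matches this gives roughly (0.663, 0.922), and with 0 matches and 76 non-matches it gives (1.0, 0.0), matching the log.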

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 763 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 180 matches and 583 non-matches
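The split step above trains a classifier on the oracle-labelled vectors and divides the remaining cluster into predicted matches and non-matches. A minimal scikit-learn sketch; the kernel and parameters actually used by the program are not shown in the log, so defaults are assumed:

```python
from sklearn import svm

def split_cluster(train_vectors, train_labels, rest_vectors):
    """Train an SVM on the oracle-classified weight vectors (label 1 =
    match, 0 = non-match), then split the remaining cluster into
    predicted matches and non-matches."""
    clf = svm.SVC()  # default RBF kernel; actual parameters are assumed
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(rest_vectors)
    matches = [v for v, p in zip(rest_vectors, preds) if p == 1]
    non_matches = [v for v, p in zip(rest_vectors, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.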

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (180, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (583, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 583 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 583 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.538, 0.789, 0.353, 0.545, 0.550] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.857, 0.417, 0.750, 0.500, 0.455] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 0 matches and 75 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analyzing file: diverg(20)175_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 175), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)175_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 789
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 789 weight vectors
  Containing 225 true matches and 564 true non-matches
    (28.52% true matches)
  Identified 750 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   731  (97.47%)
          2 :    16  (2.13%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 750 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 561

Removed 1 non-pure weight vector

Final number of weight vectors to use: 788
  Number of unique weight vectors: 750

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (750, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 750 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 750 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 31 matches and 54 non-matches
    Purity of oracle classification:  0.635
    Entropy of oracle classification: 0.947
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 665 weight vectors
  Based on 31 matches and 54 non-matches
  Classified 150 matches and 515 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)
    (515, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)

Current size of match and non-match training data sets: 31 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.95
- Size 515 weight vectors
- Estimated match proportion 0.365

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 515 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.433, 0.667, 0.500, 0.636, 0.421] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.500, 0.600, 0.353, 0.611, 0.526] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 6 matches and 70 non-matches
    Purity of oracle classification:  0.921
    Entropy of oracle classification: 0.398
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

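The purity and entropy figures the oracle step reports follow the usual two-class definitions: the majority-class fraction and the base-2 Shannon entropy of the match/non-match split. A minimal sketch — the function names are illustrative, not taken from the original script:

```python
import math

def cluster_purity(num_matches, num_non_matches):
    """Fraction of the cluster belonging to its majority class."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    """Shannon entropy (base 2) of the match/non-match distribution."""
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# The oracle round above classified 6 matches and 70 non-matches:
print(round(cluster_purity(6, 70), 3))   # 0.921
print(round(cluster_entropy(6, 70), 3))  # 0.398
```

For the 6/70 split above this reproduces the reported purity of 0.921 and entropy of 0.398.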
Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)343_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (20, 1 - acm diverg, 343), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)343_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1026
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1026 weight vectors
  Containing 198 true matches and 828 true non-matches
    (19.30% true matches)
  Identified 984 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   949  (96.44%)
          2 :    32  (3.25%)
          3 :     2  (0.20%)
          7 :     1  (0.10%)

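The occurrence distribution above can be computed by hashing each weight vector and counting duplicates, e.g. with collections.Counter — a sketch over a small hypothetical vector list, not the script's actual code:

```python
from collections import Counter

# Hypothetical weight vectors (as tuples, so they are hashable).
vectors = [(1.0, 0.0), (1.0, 0.0), (0.5, 0.3),
           (0.5, 0.3), (0.5, 0.3), (0.2, 0.9)]

occurrence = Counter(vectors)                # vector -> how often it occurs
distribution = Counter(occurrence.values())  # occurrence count -> #unique vectors
print(sorted(distribution.items()))  # [(1, 1), (2, 1), (3, 1)]
```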
Identified 0 non-pure unique weight vectors (from 984 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 808

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 984

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (984, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 984 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 984 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

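The "far" method used for these selections is a farthest-first traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A sketch under those assumptions — the start-point choice and the Euclidean metric are illustrative, and the script may differ:

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: select k well-spread row vectors."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [start]
    # Minimum distance from every vector to the selected set so far.
    min_dist = np.linalg.norm(vectors - vectors[start], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))   # farthest from all selected vectors
        selected.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected

rng = np.random.default_rng(42)
vecs = rng.random((100, 7))           # 100 seven-dimensional weight vectors
picked = farthest_first(vecs, 10)
print(len(picked), len(set(picked)))  # 10 10
```

Already-selected vectors get a minimum distance of zero, so they are never picked twice; this is what spreads the sample across the whole cluster.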
Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

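The oracle simulates manual classification at a configurable accuracy (the oracle_acc command-line parameter): each true match status is returned correctly with probability equal to the accuracy and flipped otherwise. A hedged sketch — at 100% accuracy, as in these runs, nothing is flipped:

```python
import random

def simulate_oracle(true_labels, accuracy, seed=0):
    """Return labels as a human oracle with the given accuracy would."""
    rnd = random.Random(seed)
    answers = []
    for label in true_labels:
        if rnd.random() < accuracy:
            answers.append(label)       # correct answer
        else:
            answers.append(not label)   # classification error
    return answers

truth = [True, False, False, True, False]
assert simulate_oracle(truth, 1.0) == truth  # perfect oracle: no errors
```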
Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 897 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 93 matches and 804 non-matches
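The SVM split trains a classifier on the oracle-labelled vectors and uses its predictions to divide the remaining cluster into two sub-clusters, which go back onto the queue. A sketch with scikit-learn and stand-in random data — the kernel and parameters the script actually uses are not shown in this output, so the SVC defaults here are an assumption:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(7)
# Stand-in data: oracle-labelled training vectors (1 = match, 0 = non-match)
# and the remaining unlabelled vectors of the cluster.
train_X = np.vstack([rng.random((26, 7)) * 0.4 + 0.6,   # match-like
                     rng.random((61, 7)) * 0.4])         # non-match-like
train_y = np.array([1] * 26 + [0] * 61)
unlabelled = rng.random((897, 7))

# Train on the oracle-classified vectors, then split the rest of the
# cluster by predicted class.
clf = SVC().fit(train_X, train_y)
pred = clf.predict(unlabelled)
match_cluster = unlabelled[pred == 1]
non_match_cluster = unlabelled[pred == 0]
print(len(match_cluster) + len(non_match_cluster))  # 897
```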

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (93, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (804, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 93 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 93 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(15)17_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 17), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)17_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 732
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 732 weight vectors
  Containing 219 true matches and 513 true non-matches
    (29.92% true matches)
  Identified 677 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   641  (94.68%)
          2 :    33  (4.87%)
          3 :     2  (0.30%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 677 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 492

Removed 1 non-pure weight vector

Final number of weight vectors to use: 731
  Number of unique weight vectors: 677

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (677, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 677 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 677 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 593 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 148 matches and 445 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (445, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 445 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 445 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 8 matches and 62 non-matches
    Purity of oracle classification:  0.886
    Entropy of oracle classification: 0.513
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(20)757_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 757), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)757_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
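
The purity and entropy figures reported for each oracle classification follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy (base 2) of the match proportion. A minimal sketch (the function name is illustrative, not from the program) that reproduces the numbers above:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity: fraction of weight vectors in the majority class.
    # Entropy: binary Shannon entropy (base 2) of the match proportion.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# 25 matches and 63 non-matches, as classified by the oracle above
purity, entropy = purity_entropy(25, 63)
print(round(purity, 3), round(entropy, 3))  # 0.716 0.861
```

The same two numbers reappear unchanged for both child clusters in the queue at Loop 2, since their purity/entropy estimates are inherited from the parent cluster's sample.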

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 817 non-matches
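
The split step trains a classifier on the sample the oracle just labelled and partitions the remaining, unlabelled weight vectors of the cluster by predicted class. A rough sketch, assuming scikit-learn's `SVC` with default settings (the log does not show the actual kernel or parameters used):

```python
from sklearn.svm import SVC

def split_cluster(train_vectors, train_labels, remaining_vectors):
    # Train an SVM on the oracle-labelled sample, then split the
    # unlabelled remainder of the cluster into a predicted-match and a
    # predicted-non-match child cluster.
    clf = SVC()  # kernel and parameters are an assumption, not shown in the log
    clf.fit(train_vectors, train_labels)
    predicted = clf.predict(remaining_vectors)
    matches = [v for v, c in zip(remaining_vectors, predicted) if c == 1]
    non_matches = [v for v, c in zip(remaining_vectors, predicted) if c == 0]
    return matches, non_matches
```

Both child clusters are then pushed back onto the queue, carrying the purity/entropy estimates from the parent's sample.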

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 817 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 817 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
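
The "farthest first" sampling shown above is a greedy traversal: starting from a seed vector, each step picks the weight vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster rather than concentrating in one region. A minimal sketch under the assumption of Euclidean distance and a fixed seed (the program's actual seed choice and distance metric are not visible in this log):

```python
import math

def farthest_first(vectors, k, seed_index=0):
    # Greedy farthest-first traversal: keep, for every vector, its
    # distance to the nearest already-selected vector, and repeatedly
    # select the vector for which that distance is largest.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[seed_index]]
    nearest = [dist(v, vectors[seed_index]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=nearest.__getitem__)
        selected.append(vectors[i])
        nearest = [min(d, dist(v, vectors[i])) for v, d in zip(vectors, nearest)]
    return selected

print(farthest_first([(0, 0), (1, 0), (10, 0), (5, 0)], 3))
# → [(0, 0), (10, 0), (5, 0)]
```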

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 11 matches and 60 non-matches
    Purity of oracle classification:  0.845
    Entropy of oracle classification: 0.622
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)319_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 319), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)319_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1086
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1086 weight vectors
  Containing 214 true matches and 872 true non-matches
    (19.71% true matches)
  Identified 1032 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   997  (96.61%)
          2 :    32  (3.10%)
          3 :     2  (0.19%)
         19 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1032 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 851

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1085
  Number of unique weight vectors: 1032

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1032, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1032 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1032 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 944 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 98 matches and 846 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (98, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 98 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 98 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 42 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)251_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 251), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)251_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 266
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 266 weight vectors
  Containing 209 true matches and 57 true non-matches
    (78.57% true matches)
  Identified 235 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   220  (93.62%)
          2 :    12  (5.11%)
          3 :     2  (0.85%)
         16 :     1  (0.43%)

Identified 1 non-pure unique weight vector (from 235 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 178
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 56

Removed 1 non-pure weight vector

Final number of weight vectors to use: 265
  Number of unique weight vectors: 235

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (235, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 235 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 68

Perform initial selection using "far" method

Farthest first selection of 68 weight vectors from 235 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 38 matches and 30 non-matches
    Purity of oracle classification:  0.559
    Entropy of oracle classification: 0.990
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  30
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 167 weight vectors
  Based on 38 matches and 30 non-matches
  Classified 160 matches and 7 non-matches

  Non-match cluster not large enough for required sample size
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 1
  Number of manual oracle classifications performed: 68
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (160, 0.5588235294117647, 0.9899927915575188, 0.5588235294117647)

Current size of match and non-match training data sets: 38 / 30

Selected cluster (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 160 weight vectors
- Estimated match proportion 0.559

Sample size for this cluster: 60

Farthest first selection of 60 weight vectors from 160 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 60 weight vectors
  The oracle will correctly classify 60 weight vectors and wrongly classify 0
  Classified 44 matches and 16 non-matches
    Purity of oracle classification:  0.733
    Entropy of oracle classification: 0.837
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  16
    Number of false non-matches: 0
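Each oracle step in this log reports the purity and entropy of the classified sample. Assuming purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion (assumptions that do reproduce the figures above), a minimal sketch:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary Shannon entropy
    of an oracle-classified sample of weight vectors."""
    total = num_matches + num_non_matches
    p = num_matches / total            # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                    # 0*log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy, p

# The sample above: 44 matches and 16 non-matches
purity, entropy, match_prop = cluster_stats(44, 16)
print(round(purity, 3), round(entropy, 3))  # 0.733 0.837
```

The same two numbers also appear later as the (purity, entropy) pair attached to each cluster in the queue.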

Deleted 60 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(15)239_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (15, 1 - acm diverg, 239), dtype: object
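The precision, recall and f-measure in the results row above follow from the tp/fp/fn counts via the standard definitions. A quick sanity check (a sketch, not the actual evaluation code):

```python
tp, fp, fn = 62, 1, 237   # counts from the results row above

precision = tp / (tp + fp)   # 62/63
recall = tp / (tp + fn)      # 62/299
f_measure = 2 * precision * recall / (precision + recall)

print(round(precision, 6), round(recall, 6), round(f_measure, 6))
# 0.984127 0.207358 0.342541
```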

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)239_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 997
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 997 weight vectors
  Containing 202 true matches and 795 true non-matches
    (20.26% true matches)
  Identified 947 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   913  (96.41%)
          2 :    31  (3.27%)
          3 :     2  (0.21%)
         16 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 947 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 774

Removed 1 non-pure weight vector

Final number of weight vectors to use: 996
  Number of unique weight vectors: 947

Time to load and analyse the weight vector file: 0.01 sec
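The uniqueness and frequency analysis above can be sketched with `collections.Counter`, treating each weight vector as a tuple of similarity values (illustrative data below, not the actual file contents):

```python
from collections import Counter

# Hypothetical weight vectors (tuples of similarity values)
vectors = [
    (1.0, 1.0, 0.769), (1.0, 0.0, 0.667),
    (1.0, 1.0, 0.769), (1.0, 1.0, 0.769),  # a vector occurring 3 times
    (0.8, 1.0, 0.118),
]

vec_counts = Counter(vectors)          # occurrences per unique vector
print(len(vec_counts))                 # number of unique weight vectors -> 3

# Frequency distribution: how many unique vectors occur that often
freq_dist = Counter(vec_counts.values())
for occ, num in sorted(freq_dist.items()):
    print('%3d : %d' % (occ, num))
# 1 : 2
# 3 : 1
```

The pureness analysis works the same way, grouping by vector and counting what fraction of its occurrences are true matches.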

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (947, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 947 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 947 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
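The farthest-first selection listed above can be sketched as the classic farthest-first traversal: pick a seed vector, then repeatedly add the vector whose minimum distance to the already selected set is largest. The distance metric and seed choice below (Euclidean, first vector) are assumptions, since the log does not show them:

```python
import numpy as np

def farthest_first(vectors, k, seed_idx=0):
    """Select k row indices from a 2-D array by farthest-first traversal."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [seed_idx]
    # Minimum distance from every vector to the selected set so far
    min_dist = np.linalg.norm(vectors - vectors[seed_idx], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))   # vector farthest from all selected
        selected.append(nxt)
        dist = np.linalg.norm(vectors - vectors[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return selected

rng = np.random.default_rng(0)
sample = rng.random((100, 7))            # 100 seven-dimensional vectors
picked = farthest_first(sample, 10)
print(len(picked), len(set(picked)))     # 10 10
```

Because a selected vector's minimum distance drops to zero, it can never be picked twice; the traversal favours the extremes ("corners") of the weight-vector space, which is why the sample mixes very similar and very dissimilar pairs.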

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 860 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 131 matches and 729 non-matches
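The SVM split step trains on the oracle-labelled vectors and classifies the rest of the cluster into a predicted-match and a predicted-non-match subcluster. A sketch with scikit-learn's `SVC` on hypothetical data (the actual kernel and parameters used by the program are not shown in the log):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Hypothetical oracle-labelled training data: 26 matches, 61 non-matches.
# Matches tend to have high similarity values, non-matches low ones.
train_match = rng.uniform(0.6, 1.0, size=(26, 7))
train_non_match = rng.uniform(0.0, 0.5, size=(61, 7))
X_train = np.vstack([train_match, train_non_match])
y_train = np.array([1] * 26 + [0] * 61)

clf = SVC()                  # default RBF kernel (an assumption)
clf.fit(X_train, y_train)

# Classify the remaining unlabelled weight vectors of the cluster
X_rest = rng.uniform(0.0, 1.0, size=(860, 7))
pred = clf.predict(X_rest)
print(pred.sum(), (pred == 0).sum())   # matches / non-matches found
```

The two predicted subsets then go back onto the queue as separate clusters, which is why the queue length grows from 1 to 2 in the next loop.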

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (729, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 729 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 729 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 9 matches and 63 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing the file: diverg(20)33_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 33), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)33_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 226 true matches and 857 true non-matches
    (20.87% true matches)
  Identified 1026 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   989  (96.39%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1026 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1026

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1026, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1026 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1026 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 29 matches and 59 non-matches
    Purity of oracle classification:  0.670
    Entropy of oracle classification: 0.914
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 938 weight vectors
  Based on 29 matches and 59 non-matches
  Classified 159 matches and 779 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)
    (779, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)

Current size of match and non-match training data sets: 29 / 59

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 159 weight vectors
- Estimated match proportion 0.330

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 159 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 47 matches and 8 non-matches
    Purity of oracle classification:  0.855
    Entropy of oracle classification: 0.598
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)668_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 668), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)668_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

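The "farthest first" selection above can be sketched as a greedy traversal: seed with one vector, then repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest. A minimal sketch (the program's seeding and tie-breaking are assumptions here):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly add the vector whose minimum Euclidean distance to the
    already-selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    while len(selected) < k:
        candidates = [v for v in vectors if v not in selected]
        best = max(candidates,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

# Toy 2-d example (the real vectors above are 7-dimensional)
print(farthest_first([(0.0, 0.0), (1.0, 1.0), (0.1, 0.0),
                      (0.9, 1.0), (0.5, 0.5)], 3))
# → [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```

Because each pick maximises its distance to everything chosen so far, the sample spreads across the weight-vector space rather than clustering near the seed.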
Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 29 matches and 59 non-matches
    Purity of oracle classification:  0.670
    Entropy of oracle classification: 0.914
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

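The oracle step above can be modelled as a noisy labeller that returns each vector's true match status, flipping the answer with probability 1 - accuracy. A sketch under that assumption (the program's exact sampling scheme is not shown in the log):

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Simulated manual oracle: return each weight vector's true match
    status, flipping the answer with probability 1 - accuracy.
    (Hypothetical helper; the real oracle's sampling is an assumption.)"""
    rng = rng or random.Random(42)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]

# With 100% accuracy every answer is correct, as in the run above
labels = [True] * 29 + [False] * 59
answers = oracle_classify(labels, 1.0)
print(sum(answers), len(answers) - sum(answers))  # → 29 59
```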
Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 29 matches and 59 non-matches
  Classified 162 matches and 777 non-matches

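The SVM step above trains a classifier on the 88 oracle-labelled vectors and partitions the remaining 939 into predicted matches and non-matches, which become the two queue entries of the next loop. A stdlib-only sketch, with a tiny perceptron standing in for the SVM (the actual kernel and parameters are not shown in the log):

```python
def train_linear(xs, ys, epochs=100, lr=0.1):
    """Tiny perceptron, used here as a stand-in for the program's SVM
    (hypothetical: any linear classifier splits the cluster the same
    way conceptually)."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            pred = sum(wi * xi for wi, xi in zip(w, x)) + b > 0
            if pred != y:
                sign = 1.0 if y else -1.0
                w = [wi + lr * sign * xi for wi, xi in zip(w, x)]
                b += lr * sign
    return w, b

def split_cluster(train_vecs, train_labels, remaining):
    """Split a cluster's unclassified weight vectors into a
    predicted-match and a predicted-non-match sub-cluster."""
    w, b = train_linear(train_vecs, train_labels)
    matches, non_matches = [], []
    for x in remaining:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        (matches if score > 0 else non_matches).append(x)
    return matches, non_matches
```

Both sub-clusters go back onto the queue; each inherits the purity, entropy, and match-proportion estimates of the parent's oracle sample until it is sampled itself.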
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (162, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)
    (777, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)

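The purity and entropy in each queue tuple above follow directly from the oracle's match / non-match counts for the parent cluster: purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the split. For the clusters of this loop (29 matches, 59 non-matches):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction), base-2 entropy, and estimated
    match proportion of a cluster, from its oracle-classified sample."""
    total = num_matches + num_non_matches
    p = num_matches / total  # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p

# Reproduces the (purity, entropy, proportion) values printed above
print(cluster_stats(29, 59))
```

A 50/50 sample gives purity 0.5 and entropy 1.0 (the initial queue entry); a pure sample gives purity 1.0 and entropy 0.0, which is one of the stopping conditions.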
Current size of match and non-match training data sets: 29 / 59

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 162 weight vectors
- Estimated match proportion 0.330

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 162 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)634_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (15, 1 - acm diverg, 634), dtype: object

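The precision, recall, and f-measure in the evaluation row above follow from its confusion counts (tp=62, fp=1, fn=237):

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from confusion counts, as in the
    evaluation rows of this log."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f

# Counts from the row above: tp=62, fp=1, fn=237
p, r, f = prf(62, 1, 237)
print(round(p, 6), round(r, 6), round(f, 6))  # → 0.984127 0.207358 0.342541
```

The huge `tn` count (4.76529e+07) does not enter any of these measures, which is why precision/recall rather than accuracy is reported for this heavily imbalanced matching task.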
Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)634_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 649
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 649 weight vectors
  Containing 198 true matches and 451 true non-matches
    (30.51% true matches)
  Identified 620 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   607  (97.90%)
          2 :    10  (1.61%)
          3 :     2  (0.32%)
         16 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 620 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 169
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 450

Removed 1 non-pure weight vector

Final number of weight vectors to use: 648
  Number of unique weight vectors: 620

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (620, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 620 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 620 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 33 matches and 50 non-matches
    Purity of oracle classification:  0.602
    Entropy of oracle classification: 0.970
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 537 weight vectors
  Based on 33 matches and 50 non-matches
  Classified 155 matches and 382 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6024096385542169, 0.9695235828220428, 0.39759036144578314)
    (382, 0.6024096385542169, 0.9695235828220428, 0.39759036144578314)

Current size of match and non-match training data sets: 33 / 50

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 382 weight vectors
- Estimated match proportion 0.398

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 382 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.409, 0.654, 0.500, 0.516, 0.333] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.817, 1.000, 0.250, 0.212, 0.256, 0.045, 0.250] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.522, 0.786, 0.800, 0.824, 0.667] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.632, 0.750, 0.696, 0.682, 0.450] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.500, 0.452, 0.632, 0.714, 0.667] (False)
    [0.790, 0.000, 0.636, 0.619, 0.429, 0.450, 0.609] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.786, 0.833, 0.545, 0.478, 0.346] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 0.000, 0.667, 0.722, 0.353, 0.545, 0.800] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 0 matches and 74 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing file: diverg(15)609_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 609), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)609_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 754
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 754 weight vectors
  Containing 222 true matches and 532 true non-matches
    (29.44% true matches)
  Identified 718 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   699  (97.35%)
          2 :    16  (2.23%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 718 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 529

Removed 1 non-pure weight vector

Final number of weight vectors to use: 753
  Number of unique weight vectors: 718

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (718, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 718 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 718 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
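
The purity and entropy figures reported above follow directly from the match / non-match counts. A minimal sketch, assuming purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion (the helper name `cluster_stats` is ours, not the program's):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity, entropy, and match proportion of a cluster,
    reconstructed from its match / non-match counts."""
    total = num_matches + num_non_matches
    p = num_matches / total                  # match proportion
    purity = max(p, 1.0 - p)                 # majority-class fraction
    if p in (0.0, 1.0):
        entropy = 0.0                        # a pure cluster has zero entropy
    else:
        entropy = -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)
    return purity, entropy, p
```

With the 30 matches and 54 non-matches above, `cluster_stats(30, 54)` reproduces the logged 0.643 purity and 0.940 entropy.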

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 634 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 146 matches and 488 non-matches
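
The SVM step above splits the cluster by fitting a classifier on the oracle-labelled vectors, then partitioning the remaining vectors into predicted-match and predicted-non-match children. A library-agnostic sketch; the `make_classifier` factory argument is our assumption (in the actual program it would construct an SVM, e.g. `sklearn.svm.SVC`):

```python
def split_cluster(labelled, unlabelled, make_classifier):
    """Fit a classifier on (vector, label) training pairs, then split
    the remaining unlabelled vectors into a predicted-match child
    cluster and a predicted-non-match child cluster."""
    train_X = [vec for vec, _ in labelled]
    train_y = [lab for _, lab in labelled]
    clf = make_classifier()
    clf.fit(train_X, train_y)
    matches, non_matches = [], []
    for vec in unlabelled:
        (matches if clf.predict([vec])[0] else non_matches).append(vec)
    return matches, non_matches
```

The two child clusters then re-enter the queue, which is why the queue length grows to 2 in the next loop.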

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (488, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 146 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 53 matches and 2 non-matches
    Purity of oracle classification:  0.964
    Entropy of oracle classification: 0.225
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)265_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 265), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)265_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)
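
The occurrence table above is a two-level count: how often each weight vector occurs, then how many distinct vectors share each occurrence count. A sketch using `collections.Counter` (the function name is ours):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each weight vector occurs, then tabulate how
    many distinct vectors occur once, twice, and so on."""
    per_vector = Counter(map(tuple, weight_vectors))  # vector -> occurrence count
    return Counter(per_vector.values())               # occurrence count -> number of vectors
```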

Identified 1 non-pure unique weight vectors (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors
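
The clean-up above keeps, for each non-pure unique vector, only its majority-class copies — the 0.917-pure vector loses its single minority-class copy. A sketch under our own naming and tie policy (no tie occurs in this run):

```python
from collections import defaultdict

def drop_minority_labels(pairs):
    """For each unique weight vector carrying both match and non-match
    labels, drop the minority-class copies so every remaining unique
    vector is pure.  (Tie policy assumed: matches win.)"""
    labels_by_vec = defaultdict(list)
    for vec, label in pairs:
        labels_by_vec[tuple(vec)].append(label)
    kept = []
    for vec, label in pairs:
        labels = labels_by_vec[tuple(vec)]
        majority = sum(labels) * 2 >= len(labels)  # True if matches dominate
        if bool(label) == majority:
            kept.append((vec, label))
    return kept
```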

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
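
Farthest-first selection, as used above, greedily picks each next vector as the one maximising the distance to its nearest already-selected vector. A minimal sketch, assuming Euclidean distance and a random first pick (the actual program may differ in both choices):

```python
import random

def farthest_first(vectors, k, dist=None, seed=42):
    """Select up to k vectors by farthest-first traversal: start from a
    random vector, then repeatedly add the vector farthest from the
    already-selected set."""
    if dist is None:
        dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    rng = random.Random(seed)
    remaining = list(vectors)
    selected = [remaining.pop(rng.randrange(len(remaining)))]
    # each remaining vector's distance to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in remaining]
    while remaining and len(selected) < k:
        i = max(range(len(remaining)), key=min_dist.__getitem__)
        selected.append(remaining.pop(i))
        min_dist.pop(i)
        min_dist = [min(d, dist(v, selected[-1]))
                    for d, v in zip(min_dist, remaining)]
    return selected
```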

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
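
The oracle is parameterised by an accuracy — 100% in this run, hence zero wrong classifications. A sketch of such an oracle under the assumption of independent, symmetric label noise (function name ours):

```python
import random

def noisy_oracle(true_labels, accuracy, seed=0):
    """Report each queried label correctly with probability `accuracy`,
    flipped otherwise (independent symmetric noise)."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

At `accuracy=1.0` every label is returned correctly, matching the "wrongly classify 0" lines throughout this run.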

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 101 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-matches
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)709_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 709), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)709_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 857
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 857 weight vectors
  Containing 187 true matches and 670 true non-matches
    (21.82% true matches)
  Identified 817 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   783  (95.84%)
          2 :    31  (3.79%)
          3 :     2  (0.24%)
          6 :     1  (0.12%)

Identified 0 non-pure unique weight vectors (from 817 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.000 : 650

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 857
  Number of unique weight vectors: 817

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (817, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 817 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 817 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 731 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 150 matches and 581 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (581, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 581 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 581 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 0 matches and 74 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(20)777_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 777), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)777_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1082
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1082 weight vectors
  Containing 209 true matches and 873 true non-matches
    (19.32% true matches)
  Identified 1035 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1000  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1035 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1081
  Number of unique weight vectors: 1035
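
The analysis just reported (occurrence distribution over unique weight vectors, per-vector pureness, and removal of minority-class copies of non-pure vectors) can be reproduced with a small grouping pass. A minimal sketch, assuming match labels are booleans; the helper name and tie handling are illustrative, not from the original script:

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(vectors, labels):
    """Group identical weight vectors, count how often each unique vector
    occurs, and drop minority-class copies of non-pure unique vectors."""
    by_vec = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        by_vec[vec].append(lab)

    # Occurrence count -> number of unique weight vectors with that count
    occurrence_dist = Counter(len(labs) for labs in by_vec.values())

    kept_vectors, kept_labels = [], []
    for vec, labs in by_vec.items():
        # Majority class of this unique vector (no ties appear in this log;
        # treating a tie as "non-match" here is an assumption)
        majority = sum(labs) * 2 > len(labs)
        for lab in labs:
            if lab == majority:
                kept_vectors.append(vec)
                kept_labels.append(lab)
    return occurrence_dist, kept_vectors, kept_labels
```

On the 1082 vectors above this would report the 12-fold duplicate with pureness 11/12 = 0.917 and drop its single minority-class copy, leaving 1081 vectors.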

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1035, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1035 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1035 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
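
The "farthest first" selection above greedily builds a diverse sample: after a seed vector, each pick is the remaining vector whose distance to its nearest already-selected vector is largest. A minimal sketch in plain Python; seeding from the first vector and using Euclidean distance are assumptions, the original script may differ:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of equal-length tuples."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]            # seed choice is an assumption
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest selected neighbour
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```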

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
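
The purity and entropy figures follow directly from the oracle's match/non-match counts: purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the two-class split. A minimal sketch (function names are illustrative):

```python
import math

def cluster_purity(num_matches, num_non_matches):
    """Fraction of the cluster belonging to its majority class."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    """Shannon entropy (base 2) of the match / non-match split."""
    total = num_matches + num_non_matches
    entropy = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:                 # 0 * log2(0) contributes nothing
            p = count / total
            entropy -= p * math.log2(p)
    return entropy
```

For the 23 matches / 65 non-matches above this gives 65/88 ≈ 0.739 and entropy ≈ 0.829, matching the log.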

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 947 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 846 non-matches
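
The remaining 947 vectors are then split by an SVM trained on the 88 oracle-labelled examples. The log does not say which SVM implementation is used; as an illustrative stand-in, here is a small linear SVM trained with the Pegasos sub-gradient method (matches labelled +1, non-matches -1):

```python
def train_linear_svm(X, y, lam=0.01, epochs=2000):
    """Pegasos-style training. X: list of feature tuples, y: labels in {-1, +1}.
    Returns a weight vector whose last component is the bias weight."""
    dim = len(X[0]) + 1                    # +1 for a constant bias feature
    w = [0.0] * dim
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)
            x = list(xi) + [1.0]
            margin = yi * sum(wj * xj for wj, xj in zip(w, x))
            decay = 1.0 - eta * lam
            if margin < 1.0:               # hinge-loss sub-gradient step
                w = [decay * wj + eta * yi * xj for wj, xj in zip(w, x)]
            else:                          # regularisation-only step
                w = [decay * wj for wj in w]
    return w

def svm_classify(w, x):
    """Predict +1 (match) or -1 (non-match)."""
    score = sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return 1 if score >= 0 else -1
```

Splitting the cluster then amounts to predicting each unlabelled vector and routing it to the match or non-match child cluster.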

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 846 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 846 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
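
Each run in this log follows the same recursive selection loop: pop a cluster from the queue, sample from it, ask the oracle, keep the labelled vectors as training data, and split the cluster further when it is not pure enough or too large, until the manual-classification budget is exhausted. A simplified sketch of that control flow; the function names, sampling, and thresholds are illustrative placeholders:

```python
import random

def recursive_selection(cluster, oracle, split, sample_size, budget,
                        min_purity=0.95, max_cluster_size=50):
    """Simplified control flow of the runs in this log. `oracle` labels one
    weight vector; `split` divides the unlabelled remainder into child
    clusters. Returns (match set, non-match set, oracle calls used)."""
    queue = [cluster]
    match_train, non_match_train = [], []
    oracle_calls = 0
    while queue and oracle_calls + sample_size <= budget:
        current = queue.pop(random.randrange(len(queue)))  # queue ordering: random
        sample, rest = current[:sample_size], current[sample_size:]
        labels = [oracle(v) for v in sample]               # manual classification
        oracle_calls += len(sample)
        match_train.extend(v for v, is_m in zip(sample, labels) if is_m)
        non_match_train.extend(v for v, is_m in zip(sample, labels) if not is_m)
        purity = max(sum(labels), len(labels) - sum(labels)) / max(len(labels), 1)
        # Cluster not pure enough or too large: split further and re-queue
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            queue.extend(child for child in split(rest, sample, labels) if child)
    return match_train, non_match_train, oracle_calls
```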

57.0
Analyzing file: diverg(10)75_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 75), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)75_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 768
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 768 weight vectors
  Containing 222 true matches and 546 true non-matches
    (28.91% true matches)
  Identified 714 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   677  (94.82%)
          2 :    34  (4.76%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 714 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 525

Removed 1 non-pure weight vector

Final number of weight vectors to use: 767
  Number of unique weight vectors: 714

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (714, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 714 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 714 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 630 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 149 matches and 481 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (481, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 481 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 481 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.714, 0.727, 0.750, 0.294, 0.833] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.348, 0.429, 0.526, 0.529, 0.619] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 9 matches and 62 non-matches
    Purity of oracle classification:  0.873
    Entropy of oracle classification: 0.548
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(20)691_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 691), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)691_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
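
The log does not show the selection code itself; assuming Euclidean distance in weight-vector space and a fixed starting vector (both assumptions, as is the helper name `farthest_first`), the greedy farthest-first traversal used above can be sketched as:

```python
from math import dist  # Python 3.8+: Euclidean distance between two points

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    minimum distance to the already-selected set is largest."""
    selected = [vectors[start]]
    remaining = vectors[:start] + vectors[start + 1:]
    while len(selected) < k and remaining:
        # For each candidate, score it by its distance to the CLOSEST
        # already-selected vector, then take the candidate with the
        # largest such score (the "farthest" one).
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy usage: pick the 3 mutually farthest points from a small 2-D set
pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
picked = farthest_first(pts, 3)
# picked == [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

Note the greedy choice is deterministic given the start index; the spread of True/False labels in the list above reflects this diversity-seeking behaviour.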

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
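
The purity, entropy, and estimated match proportion reported for each oracle classification are consistent with the standard two-class definitions below (a sketch; the exact formulas in the program are assumed from the logged values):

```python
from math import log2

def cluster_stats(num_matches, num_non_matches):
    """Purity, binary entropy, and estimated match proportion of a cluster,
    computed from oracle-classified match / non-match counts."""
    total = num_matches + num_non_matches
    p = num_matches / total                           # estimated match proportion
    purity = max(num_matches, num_non_matches) / total
    # Binary (Shannon) entropy of the match proportion, 0 for a pure cluster
    entropy = 0.0 if p in (0.0, 1.0) else -(p * log2(p) + (1 - p) * log2(1 - p))
    return purity, entropy, p

# Reproduce the figures reported above for 28 matches / 58 non-matches
purity, entropy, p = cluster_stats(28, 58)
# purity ≈ 0.674, entropy ≈ 0.910, p ≈ 0.326
```

These are the same three numbers that reappear per cluster in each "Loop" summary below.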

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches
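
The log reports an SVM trained on the oracle-labelled sample and used to split the remaining weight vectors into two child clusters. As a dependency-free stand-in (not the program's actual SVM), a nearest-centroid rule illustrates the same kind of split:

```python
from math import dist  # Euclidean distance, Python 3.8+

def split_by_centroids(match_sample, non_match_sample, remaining):
    """Partition unlabelled weight vectors into two child clusters using the
    centroids of the oracle-labelled match / non-match samples.
    (A simple stand-in for the SVM split reported in the log.)"""
    def centroid(vecs):
        return tuple(sum(col) / len(vecs) for col in zip(*vecs))
    cm, cn = centroid(match_sample), centroid(non_match_sample)
    matches = [v for v in remaining if dist(v, cm) <= dist(v, cn)]
    non_matches = [v for v in remaining if dist(v, cm) > dist(v, cn)]
    return matches, non_matches

# Toy usage in 2-D: vectors near (1, 1) land in the match child cluster
m, n = split_by_centroids([(1.0, 1.0)], [(0.0, 0.0)],
                          [(0.9, 0.8), (0.1, 0.2)])
# m == [(0.9, 0.8)], n == [(0.1, 0.2)]
```

The two resulting child clusters are what get pushed back onto the queue, as shown by the "Queue length: 2" line of the next loop.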

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 153 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)406_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 406), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)406_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1044
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1044 weight vectors
  Containing 225 true matches and 819 true non-matches
    (21.55% true matches)
  Identified 987 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   950  (96.25%)
          2 :    34  (3.44%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
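
The occurrence-frequency table above can be reproduced with two passes of `collections.Counter` (a sketch; `occurrence_distribution` is a hypothetical helper, not from the program):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """First count how often each distinct weight vector occurs, then count
    how many distinct vectors share each occurrence count (the table above)."""
    per_vector = Counter(map(tuple, weight_vectors))   # vector -> occurrences
    return Counter(per_vector.values())                # occurrences -> #vectors

# Toy usage: three vectors, one duplicated
dist_table = occurrence_distribution([[0.1, 0.2], [0.1, 0.2], [0.5, 0.5]])
# one vector occurs twice, one occurs once
```

The same per-vector counts also drive the pureness analysis that follows: a unique vector shared by both true matches and true non-matches is "non-pure", and its minority-class copies are removed.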

Identified 1 non-pure unique weight vector (from 987 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 798

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1043
  Number of unique weight vectors: 987

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (987, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 987 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 987 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 32 matches and 55 non-matches
    Purity of oracle classification:  0.632
    Entropy of oracle classification: 0.949
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 900 weight vectors
  Based on 32 matches and 55 non-matches
  Classified 329 matches and 571 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (329, 0.632183908045977, 0.9489804585630242, 0.367816091954023)
    (571, 0.632183908045977, 0.9489804585630242, 0.367816091954023)

Current size of match and non-match training data sets: 32 / 55

Selected cluster (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 571 weight vectors
- Estimated match proportion 0.368

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 571 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)421_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (15, 1 - acm diverg, 421), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)421_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 898
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 898 weight vectors
  Containing 155 true matches and 743 true non-matches
    (17.26% true matches)
  Identified 862 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   834  (96.75%)
          2 :    25  (2.90%)
          3 :     2  (0.23%)
          8 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 862 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 139
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 722

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 890
  Number of unique weight vectors: 861

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (861, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 861 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 861 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 25 matches and 61 non-matches
    Purity of oracle classification:  0.709
    Entropy of oracle classification: 0.870
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 775 weight vectors
  Based on 25 matches and 61 non-matches
  Classified 87 matches and 688 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (87, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)
    (688, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)

Current size of match and non-match training data sets: 25 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.71 and entropy 0.87
- Size 87 weight vectors
- Estimated match proportion 0.291

Sample size for this cluster: 42

Farthest-first selection of 42 weight vectors from 87 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.952, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
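
The "farthest first" selection used in these blocks is, presumably, the standard farthest-first traversal: seed with one vector, then repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest. A minimal sketch (function name and the choice of seed are assumptions):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector,
    # then repeatedly add the vector that maximises the minimum
    # Euclidean distance to everything selected so far.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Tiny illustration: from four 2-D points, pick the three most spread out.
pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

This greedy spread is why the selected sample mixes extreme weight vectors rather than near-duplicates.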

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 41 matches and 1 non-match
    Purity of oracle classification:  0.976
    Entropy of oracle classification: 0.162
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing file: diverg(20)395_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 395), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)395_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 226 true matches and 857 true non-matches
    (20.87% true matches)
  Identified 1026 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   989  (96.39%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1026 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836
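
The occurrence and pureness statistics above come from grouping identical weight vectors. A minimal sketch with `collections.Counter` (function names and the toy data are illustrative):

```python
from collections import Counter

def occurrence_distribution(vectors):
    # Count how often each unique weight vector occurs, then count how
    # many unique vectors share each occurrence count.
    vec_counts = Counter(tuple(v) for v in vectors)
    return Counter(vec_counts.values())

def pureness(match_flags):
    # Pureness of one unique weight vector: fraction of its
    # occurrences generated by true matching record pairs.
    return sum(match_flags) / len(match_flags)

# Toy example: three copies of one vector, one copy of another.
vecs = [[1.0, 0.5], [1.0, 0.5], [1.0, 0.5], [0.2, 0.1]]
print(occurrence_distribution(vecs))
print(pureness([1, 1, 0]))
```

Unique vectors whose pureness is strictly between 0 and 1 are the "non-pure" ones; their minority-class copies are removed before clustering.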

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1026

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1026, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1026 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest-first selection of 88 weight vectors from 1026 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 938 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 177 matches and 761 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (177, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (761, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 177 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 58

Farthest-first selection of 58 weight vectors from 177 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 44 matches and 14 non-matches
    Purity of oracle classification:  0.759
    Entropy of oracle classification: 0.797
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  14
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)208_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 208), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)208_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 714
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 714 weight vectors
  Containing 201 true matches and 513 true non-matches
    (28.15% true matches)
  Identified 669 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   635  (94.92%)
          2 :    31  (4.63%)
          3 :     2  (0.30%)
         11 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 669 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 492

Removed 1 non-pure weight vector

Final number of weight vectors to use: 713
  Number of unique weight vectors: 669

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (669, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 669 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest-first selection of 84 weight vectors from 669 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 26 matches and 58 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.893
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 585 weight vectors
  Based on 26 matches and 58 non-matches
  Classified 123 matches and 462 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)
    (462, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)

Current size of match and non-match training data sets: 26 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 123 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 49

Farthest-first selection of 49 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)29_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (15, 1 - acm diverg, 29), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)29_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 297
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 297 weight vectors
  Containing 183 true matches and 114 true non-matches
    (61.62% true matches)
  Identified 276 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   261  (94.57%)
          2 :    12  (4.35%)
          3 :     2  (0.72%)
          6 :     1  (0.36%)

Identified 0 non-pure unique weight vectors (from 276 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 164
     0.000 : 112

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 297
  Number of unique weight vectors: 276

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (276, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 276 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 71

Perform initial selection using "far" method

Farthest-first selection of 71 weight vectors from 276 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
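
Farthest-first ("far") selection greedily picks vectors that are maximally spread out in the weight space; a sketch under the assumption of Euclidean distance and a random starting vector (function names are illustrative):

```python
import random

def euclid(a, b):
    # Euclidean distance between two weight vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def farthest_first(vectors, k, seed=0):
    """Start from a random vector, then repeatedly add the vector whose
    minimum distance to the already selected set is largest."""
    rng = random.Random(seed)
    selected = [rng.choice(vectors)]
    while len(selected) < k:
        candidate = max((v for v in vectors if v not in selected),
                        key=lambda v: min(euclid(v, s) for s in selected))
        selected.append(candidate)
    return selected

corners = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.5, 0.5)]
picked = farthest_first(corners, 3)
```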

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 36 matches and 35 non-matches
    Purity of oracle classification:  0.507
    Entropy of oracle classification: 1.000
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  35
    Number of false non-matches: 0
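
Purity here is the majority-class fraction of the oracle-labelled sample, and entropy is the binary entropy of its match proportion, which is why 36 matches vs. 35 non-matches gives purity 0.507 and entropy 1.000. A sketch:

```python
import math

def purity_entropy(num_match, num_non_match):
    total = num_match + num_non_match
    purity = max(num_match, num_non_match) / total
    p = num_match / total  # match proportion
    if p in (0.0, 1.0):
        entropy = 0.0      # a pure sample carries no uncertainty
    else:
        entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return purity, entropy

purity, entropy = purity_entropy(36, 35)
print(round(purity, 3), round(entropy, 3))  # 0.507 1.0
```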

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 205 weight vectors
  Based on 36 matches and 35 non-matches
  Classified 140 matches and 65 non-matches
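
The labelled sample then trains a classifier that splits the remaining vectors into a predicted-match and a predicted-non-match child cluster. The run above uses an SVM (scikit-learn's `svm.SVC` would be the natural choice in practice); the sketch below substitutes a nearest-centroid rule purely to illustrate the splitting step:

```python
def centroid(vectors):
    # Component-wise mean of a list of equal-length weight vectors.
    dim, n = len(vectors[0]), len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(dim))

def split_cluster(cluster, match_train, non_match_train):
    """Nearest-centroid stand-in for the SVM split: assign each remaining
    vector to whichever training centroid it is closer to."""
    cm, cn = centroid(match_train), centroid(non_match_train)
    sqdist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    matches = [v for v in cluster if sqdist(v, cm) < sqdist(v, cn)]
    non_matches = [v for v in cluster if sqdist(v, cm) >= sqdist(v, cn)]
    return matches, non_matches

m, n = split_cluster([(0.9, 0.9), (0.1, 0.1)], [(1.0, 1.0)], [(0.0, 0.0)])
```

For instance, the call above assigns (0.9, 0.9) to the match child and (0.1, 0.1) to the non-match child.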

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 71
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.5070422535211268, 0.9998568991526107, 0.5070422535211268)
    (65, 0.5070422535211268, 0.9998568991526107, 0.5070422535211268)

Current size of match and non-match training data sets: 36 / 35

Selected cluster with (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 65 weight vectors
- Estimated match proportion 0.507

Sample size for this cluster: 39

Farthest first selection of 39 weight vectors from 65 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)

Perform oracle with 100.00% accuracy on 39 weight vectors
  The oracle will correctly classify 39 weight vectors and wrongly classify 0
  Classified 0 matches and 39 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  39
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 39 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(20)367_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 367), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)367_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 810
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 810 weight vectors
  Containing 223 true matches and 587 true non-matches
    (27.53% true matches)
  Identified 756 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   719  (95.11%)
          2 :    34  (4.50%)
          3 :     2  (0.26%)
         17 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 756 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 566

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 809
  Number of unique weight vectors: 756

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (756, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 756 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 756 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 671 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 94 matches and 577 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (577, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 577 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 577 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 20 matches and 53 non-matches
    Purity of oracle classification:  0.726
    Entropy of oracle classification: 0.847
    Number of true matches:      20
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)844_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 844), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)844_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1035
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1035 weight vectors
  Containing 223 true matches and 812 true non-matches
    (21.55% true matches)
  Identified 981 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   944  (96.23%)
          2 :    34  (3.47%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 981 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 791

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1034
  Number of unique weight vectors: 981

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (981, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 981 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 981 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 894 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 160 matches and 734 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (160, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (734, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 734 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 734 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
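
Farthest-first selection is a greedy k-centre style traversal: each step adds the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and seeding from the first vector (the program's actual seeding strategy is not shown in the log):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over numeric vectors: start from
    # the first vector, then repeatedly add the vector whose minimum
    # Euclidean distance to the selected set is largest.
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    selected = [vectors[0]]
    while len(selected) < min(k, len(vectors)):
        candidates = [v for v in vectors if v not in selected]
        if not candidates:
            break
        selected.append(max(candidates,
                            key=lambda v: min(dist(v, s) for s in selected)))
    return selected
```

This is why the selected vectors listed above are spread across the corners of the weight space rather than drawn from one region.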

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 3 matches and 74 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.238
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)351_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 351), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)351_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 689
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 689 weight vectors
  Containing 219 true matches and 470 true non-matches
    (31.79% true matches)
  Identified 656 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   640  (97.56%)
          2 :    13  (1.98%)
          3 :     2  (0.30%)
         17 :     1  (0.15%)
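
The occurrence frequency distribution printed above (how many distinct weight vectors occur once, twice, and so on) amounts to two nested counts; a minimal sketch with hypothetical example vectors:

```python
from collections import Counter

def occurrence_distribution(vectors):
    # First count how often each distinct weight vector occurs, then
    # count how many distinct vectors share each occurrence count.
    per_vector = Counter(tuple(v) for v in vectors)
    return Counter(per_vector.values())

# Hypothetical example: one vector occurs three times, one twice, one once.
vecs = [[1.0, 0.5], [1.0, 0.5],
        [0.2, 0.3], [0.2, 0.3], [0.2, 0.3],
        [0.9, 0.1]]
dist = occurrence_distribution(vecs)  # dist[3] == 1, dist[2] == 1, dist[1] == 1
```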

Identified 1 non-pure unique weight vector (from 656 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 469

Removed 1 non-pure weight vector

Final number of weight vectors to use: 688
  Number of unique weight vectors: 656

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (656, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 656 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 656 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 572 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 145 matches and 427 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (427, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 145 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)761_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 761), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)761_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)778_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 778), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)778_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
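The "far" method used for the selection above is a farthest-first traversal: starting from a seed vector, it repeatedly adds the vector whose minimum distance to the already selected set is largest, so the sample spreads across the weight-vector space. A sketch assuming Euclidean distance and a fixed seed index (the actual script may choose both differently):

```python
import math

def farthest_first(vectors, k, seed_index=0):
    # Greedy farthest-first traversal over weight vectors.
    selected = [vectors[seed_index]]
    # Minimum distance from every vector to the selected set so far.
    dists = [math.dist(v, vectors[seed_index]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=dists.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            d = math.dist(v, vectors[i])
            if d < dists[j]:
                dists[j] = d
    return selected
```

Because each pick maximizes distance to everything chosen so far, the resulting sample mixes clear matches and clear non-matches, as seen in the listing above.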

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
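The purity and entropy figures printed after each oracle round follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary (base-2 Shannon) entropy of the match proportion. A small sketch that reproduces the figures for the 23 matches and 65 non-matches above:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity: fraction of the majority class in the classified sample.
    # Entropy: binary Shannon entropy (base 2) of the match proportion.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

`purity_entropy(23, 65)` yields the 0.739 / 0.829 pair reported here; the match fraction 23/88 ≈ 0.261 is the estimated match proportion carried into the cluster queue in the next loop.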

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
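After an oracle round, the remaining unlabelled vectors of the cluster are split by a classifier trained on the oracle-labelled sample (an SVM here, per the `split_classifier` option). A minimal sketch assuming scikit-learn's `svm.SVC`; the kernel and parameters of the original script are not visible in this log, so a linear kernel is an assumption:

```python
from sklearn import svm  # assumed dependency; the original may differ

def svm_split(train_vecs, train_labels, remaining_vecs):
    # Train on the oracle-classified sample (1 = match, 0 = non-match),
    # then partition the rest of the cluster by predicted class.
    clf = svm.SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters (109 predicted matches and 847 predicted non-matches above) are then pushed back onto the cluster queue for further refinement.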

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-matches
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)592_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 592), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)592_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 831
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 831 weight vectors
  Containing 227 true matches and 604 true non-matches
    (27.32% true matches)
  Identified 774 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   737  (95.22%)
          2 :    34  (4.39%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vectors (from 774 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 583

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 830
  Number of unique weight vectors: 774

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (774, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 774 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 774 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 689 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 151 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (538, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 538 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 538 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 9 matches and 64 non-matches
    Purity of oracle classification:  0.877
    Entropy of oracle classification: 0.539
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)642_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 642), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)642_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
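
The purity and entropy figures in the oracle output above can be reproduced with a small helper. A minimal sketch, assuming purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion (the helper name `purity_entropy` is illustrative, not from the original script):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity (majority fraction) and binary entropy of a cluster."""
    n = num_match + num_non_match
    p = num_match / n                      # match proportion
    purity = max(p, 1 - p)                 # fraction of the majority class
    if p in (0.0, 1.0):                    # a pure cluster has zero entropy
        entropy = 0.0
    else:
        entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return purity, entropy

# 23 matches and 65 non-matches, as classified by the oracle above
purity, entropy = purity_entropy(23, 65)
```

With these counts the helper gives 0.739 and 0.829, matching the reported values.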

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches
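
The SVM split that follows an impure cluster can be sketched like this. This is an assumption-laden illustration using scikit-learn's `SVC` with a linear kernel; the original program's SVM binding and parameters are not shown in the log and may differ:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining weight vectors of the cluster into predicted matches
    and non-matches (label 1 = match, 0 = non-match)."""
    clf = SVC(kernel="linear")             # assumed kernel
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches
```

The two predicted sub-clusters then re-enter the processing queue, which is why the queue length grows from one loop to the next.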

39.0
Analyzing file: diverg(10)567_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 567), dtype: object
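
The precision, recall, and f-measure values in the Series above follow directly from the tp/fp/fn counts. A minimal sketch of the standard formulas:

```python
def precision_recall_f(tp, fp, fn):
    """Precision, recall and F-measure from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# tp=68, fp=1, fn=231 as in the Series printed above
p, r, f = precision_recall_f(68, 1, 231)
```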

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)567_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 583
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 583 weight vectors
  Containing 187 true matches and 396 true non-matches
    (32.08% true matches)
  Identified 561 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   545  (97.15%)
          2 :    13  (2.32%)
          3 :     2  (0.36%)
          6 :     1  (0.18%)
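
The occurrence frequency table above can be computed with two nested counts: first count how often each unique vector occurs, then count how many vectors share each occurrence count. A sketch, assuming weight vectors are lists of floats (the helper name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight
    vectors that occur that often."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

dist = occurrence_distribution([[1.0, 0.5], [1.0, 0.5], [0.2, 0.3]])
# one vector occurs twice, one occurs once
```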

Identified 0 non-pure unique weight vectors (from 561 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.000 : 394

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 583
  Number of unique weight vectors: 561

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (561, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 561 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 561 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.750, 0.905, 0.667, 0.500, 0.571] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
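
The farthest-first selection shown above can be sketched as a greedy traversal: start from one vector, then repeatedly pick the remaining vector whose minimum distance to the already-selected set is largest. A minimal version, assuming Euclidean distance and an arbitrary first pick (the original implementation's metric and seeding may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors."""
    selected = [vectors[0]]                # arbitrary starting vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # pick the vector farthest from its nearest selected neighbour
        far = max(remaining, key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(far)
        remaining.remove(far)
    return selected

selected = farthest_first([[0, 0], [10, 0], [5, 0], [0.1, 0]], 2)
```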

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 29 matches and 53 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.937
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 479 weight vectors
  Based on 29 matches and 53 non-matches
  Classified 133 matches and 346 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)
    (346, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)

Current size of match and non-match training data sets: 29 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 346 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 346 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.846, 0.542, 0.588, 0.579, 0.423] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.500, 0.667, 0.571, 0.500, 0.625] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.367, 0.667, 0.583, 0.625, 0.316] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 7 matches and 63 non-matches
    Purity of oracle classification:  0.900
    Entropy of oracle classification: 0.469
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analyzing file: diverg(15)199_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 199), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)199_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 647
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 647 weight vectors
  Containing 215 true matches and 432 true non-matches
    (33.23% true matches)
  Identified 595 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   559  (93.95%)
          2 :    33  (5.55%)
          3 :     2  (0.34%)
         16 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 595 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 411

Removed 1 non-pure weight vector

Final number of weight vectors to use: 646
  Number of unique weight vectors: 595

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (595, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 595 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 595 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 28 matches and 54 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 513 weight vectors
  Based on 28 matches and 54 non-matches
  Classified 146 matches and 367 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6585365853658537, 0.9262122127346665, 0.34146341463414637)
    (367, 0.6585365853658537, 0.9262122127346665, 0.34146341463414637)

Current size of match and non-match training data sets: 28 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 367 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 367 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.714, 0.727, 0.750, 0.294, 0.833] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.348, 0.429, 0.526, 0.529, 0.619] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 10 matches and 60 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analyzing file: diverg(15)582_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 582), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)582_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 452
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 452 weight vectors
  Containing 221 true matches and 231 true non-matches
    (48.89% true matches)
  Identified 416 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   400  (96.15%)
          2 :    13  (3.12%)
          3 :     2  (0.48%)
         20 :     1  (0.24%)

Identified 1 non-pure unique weight vector (from 416 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 230

Removed 1 non-pure weight vector
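
The pureness of each unique weight vector (the fraction of its occurrences generated by true matches) can be sketched as below; minority-class copies of vectors that are not fully pure are the ones removed above. The helper name is an assumption, not the original code:

```python
from collections import defaultdict

def pureness_per_vector(weight_vectors, is_match_flags):
    # Group identical weight vectors and compute, for each unique vector,
    # the fraction of its occurrences generated by true matches.
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, is_match_flags):
        groups[tuple(vec)].append(bool(is_match))
    return {vec: sum(flags) / len(flags) for vec, flags in groups.items()}

# A vector seen twice with conflicting labels has pureness 0.5:
print(pureness_per_vector([[1, 0], [1, 0], [0, 1]], [True, False, True]))
# {(1, 0): 0.5, (0, 1): 1.0}
```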

Final number of weight vectors to use: 451
  Number of unique weight vectors: 416

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (416, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 416 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78
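
The sample sizes in this log (78 of 416, 77 of 391, 88 of 1044, ...) are consistent with a Cochran-style sample size with finite population correction, using a 95% z-score, a 10% margin of error, and the cluster's estimated match proportion as p. These constants are inferred from the logged numbers, not confirmed from the program's source:

```python
def sample_size(n, match_prop, z=1.96, margin=0.10):
    # Cochran sample size with finite population correction; match_prop is
    # the cluster's estimated match proportion. The constants (z, margin)
    # are assumptions that reproduce the sizes reported in this log.
    pq = match_prop * (1.0 - match_prop)
    return int(round(n * z * z * pq / (margin * margin * (n - 1) + z * z * pq)))

print(sample_size(416, 0.5))   # 78
print(sample_size(1044, 0.5))  # 88
```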

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 416 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
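
Farthest-first selection as in the listing above can be sketched as a greedy max-min traversal: start from one vector, then repeatedly add the vector whose distance to its closest already-selected vector is largest. This sketch assumes Euclidean distance and a fixed starting index; the original program may differ:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(vectors, k, start=0):
    # Greedy farthest-first traversal over the weight vectors.
    selected = [start]
    # min_dist[i]: distance from vector i to its closest selected vector.
    min_dist = [euclidean(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist = [min(d, euclidean(v, vectors[nxt]))
                    for d, v in zip(min_dist, vectors)]
    return selected

# On a line, the farthest point from 0.0 is 10.0, then 5.0 fills the gap:
print(farthest_first([[0.0], [5.0], [10.0], [9.0]], 3))  # [0, 2, 1]
```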

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 35 matches and 43 non-matches
    Purity of oracle classification:  0.551
    Entropy of oracle classification: 0.992
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 338 weight vectors
  Based on 35 matches and 43 non-matches
  Classified 146 matches and 192 non-matches
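
The SVM step above trains on the oracle-labelled sample and splits the remaining vectors of the cluster into predicted matches and non-matches, each becoming a new cluster in the queue. A sketch assuming scikit-learn's `SVC` (the original 2015 program may use a different SVM library):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    # Train a linear SVM on the oracle-classified sample, then split the
    # unclassified remainder of the cluster by predicted class.
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    predictions = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, predictions) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, predictions) if p == 0]
    return matches, non_matches
```

As the queue listing in the next loop shows, both child clusters initially inherit the purity, entropy, and estimated match proportion computed from the parent's oracle-classified sample.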

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.5512820512820513, 0.9923985003332222, 0.44871794871794873)
    (192, 0.5512820512820513, 0.9923985003332222, 0.44871794871794873)

Current size of match and non-match training data sets: 35 / 43

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 192 weight vectors
- Estimated match proportion 0.449

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 192 vectors
  The selected farthest weight vectors are:
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.929, 1.000, 0.182, 0.238, 0.188, 0.146, 0.270] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 8 matches and 56 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)103_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 103), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)103_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 417
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 417 weight vectors
  Containing 200 true matches and 217 true non-matches
    (47.96% true matches)
  Identified 391 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   377  (96.42%)
          2 :    11  (2.81%)
          3 :     2  (0.51%)
         12 :     1  (0.26%)

Identified 1 non-pure unique weight vector (from 391 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 174
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 216

Removed 1 non-pure weight vector

Final number of weight vectors to use: 416
  Number of unique weight vectors: 391

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (391, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 391 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 391 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 39 matches and 38 non-matches
    Purity of oracle classification:  0.506
    Entropy of oracle classification: 1.000
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 314 weight vectors
  Based on 39 matches and 38 non-matches
  Classified 133 matches and 181 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.5064935064935064, 0.9998783322990061, 0.5064935064935064)
    (181, 0.5064935064935064, 0.9998783322990061, 0.5064935064935064)

Current size of match and non-match training data sets: 39 / 38

Selected cluster (queue ordering: random) with:
- Purity 0.51 and entropy 1.00
- Size 181 weight vectors
- Estimated match proportion 0.506

Sample size for this cluster: 63

Farthest first selection of 63 weight vectors from 181 vectors
  The selected farthest weight vectors are:
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.717, 1.000, 0.240, 0.231, 0.065, 0.192, 0.184] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.740, 1.000, 0.261, 0.273, 0.186, 0.171, 0.095] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.663, 1.000, 0.273, 0.244, 0.226, 0.196, 0.238] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.850, 1.000, 0.179, 0.205, 0.188, 0.061, 0.180] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 63 weight vectors
  The oracle will correctly classify 63 weight vectors and wrongly classify 0
  Classified 8 matches and 55 non-matches
    Purity of oracle classification:  0.873
    Entropy of oracle classification: 0.549
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 63 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)193_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 193), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)193_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.05 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
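
The purity and entropy figures reported above follow directly from the oracle's match/non-match counts: purity is the majority-class fraction, and entropy is the binary (base-2) entropy of the match proportion. A minimal sketch (the function name is mine, not from the script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and base-2 entropy of a labelled cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total                  # match proportion
    purity = max(p, 1.0 - p)                 # fraction in the majority class
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# 23 matches and 65 non-matches, as classified by the oracle above
purity, entropy = purity_entropy(23, 65)
# round(purity, 3) == 0.739, round(entropy, 3) == 0.829
```

Note the match proportion 23/88 = 0.261 is also what the log later reports as the cluster's estimated match proportion.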

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
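
At this point the script trains an SVM on the oracle-labelled vectors and uses it to split the remaining cluster into a candidate-match and a candidate-non-match sub-cluster. A dependency-free sketch of that split step, with a nearest-centroid rule deliberately standing in for the SVM (with scikit-learn you would swap in `sklearn.svm.SVC` trained on the same labelled vectors):

```python
import math

def split_cluster(train_matches, train_non_matches, cluster):
    """Split `cluster` into (predicted_matches, predicted_non_matches) using
    a nearest-centroid rule fitted on the oracle-labelled vectors.
    (The actual program trains an SVM here; the data flow is the same.)"""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    cm = centroid(train_matches)       # centroid of labelled matches
    cn = centroid(train_non_matches)   # centroid of labelled non-matches
    pred_m, pred_n = [], []
    for v in cluster:
        (pred_m if math.dist(v, cm) <= math.dist(v, cn) else pred_n).append(v)
    return pred_m, pred_n

m, n = split_cluster([(0.9, 0.9)], [(0.1, 0.1)],
                     [(0.8, 1.0), (0.2, 0.0), (0.95, 0.9)])
# m == [(0.8, 1.0), (0.95, 0.9)], n == [(0.2, 0.0)]
```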

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
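
The selection above is a greedy farthest-first traversal: starting from a seed vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the weight-vector space. A minimal sketch under that reading (the seeding strategy and the Euclidean metric are assumptions):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    minimum Euclidean distance to the already-selected set is largest."""
    selected = [vectors[0]]                 # seed with an arbitrary vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# a tight pair near the origin plus spread-out corners
corners = farthest_first([(0, 0), (0.1, 0), (1, 1), (0, 1), (1, 0)], 3)
```

Because each pick maximises distance to everything chosen so far, near-duplicates like `(0.1, 0)` are selected last, which is exactly the behaviour wanted for diverse training examples.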

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)474_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 474), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)474_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
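
The table above is a frequency-of-frequencies: first count how often each weight vector occurs, then count how many distinct vectors share each occurrence count. A sketch of that two-level count:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct vectors having it."""
    per_vector = Counter(map(tuple, weight_vectors))   # vector -> occurrences
    return Counter(per_vector.values())                # occurrences -> #vectors

dist = occurrence_distribution([[0.5, 1.0], [0.5, 1.0], [0.2, 0.3], [1.0, 1.0]])
# dist == {2: 1, 1: 2}: one vector occurs twice, two vectors occur once
```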

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852
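
Pureness here is the proportion of true matches among the record pairs that share one unique weight vector; a vector whose pureness lies strictly between 0 and 1 is non-pure, and its minority-class copies are removed. A sketch of the pureness computation (the function name is mine):

```python
from collections import defaultdict

def pureness(pairs):
    """pairs: iterable of (weight_vector, is_match).
    Returns the proportion of matches per unique weight vector."""
    counts = defaultdict(lambda: [0, 0])          # vector -> [matches, total]
    for vec, is_match in pairs:
        counts[tuple(vec)][0] += int(is_match)
        counts[tuple(vec)][1] += 1
    return {vec: m / t for vec, (m, t) in counts.items()}

p = pureness([([0.9, 1.0], True), ([0.9, 1.0], True), ([0.9, 1.0], False),
              ([0.1, 0.0], False)])
# p[(0.9, 1.0)] == 2/3 (non-pure), p[(0.1, 0.0)] == 0.0 (pure non-match)
```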

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)949_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 949), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)949_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 565
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 565 weight vectors
  Containing 147 true matches and 418 true non-matches
    (26.02% true matches)
  Identified 548 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   539  (98.36%)
          2 :     6  (1.09%)
          3 :     2  (0.36%)
          8 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 548 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 132
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 415

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 557
  Number of unique weight vectors: 547

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (547, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 547 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 547 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 31 matches and 50 non-matches
    Purity of oracle classification:  0.617
    Entropy of oracle classification: 0.960
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 466 weight vectors
  Based on 31 matches and 50 non-matches
  Classified 107 matches and 359 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (107, 0.6172839506172839, 0.9599377175669783, 0.38271604938271603)
    (359, 0.6172839506172839, 0.9599377175669783, 0.38271604938271603)

Current size of match and non-match training data sets: 31 / 50

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 107 weight vectors
- Estimated match proportion 0.383

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 107 vectors
  The selected farthest weight vectors are:
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 42 matches and 7 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing the file: diverg(20)458_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 458), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)458_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
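The "far" initial selection above appears to be a farthest-first traversal over the cluster's weight vectors. A minimal sketch under that assumption — Euclidean distance and a fixed starting vector are assumptions here; the actual program may seed the traversal and measure distance differently:

```python
def farthest_first(vectors, k):
    # Greedy farthest-first traversal: start from the first vector, then
    # repeatedly add the vector whose minimum distance to the already
    # selected set is largest, until k vectors are selected.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected

vecs = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.5, 0.5)]
print(farthest_first(vecs, 3))   # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```

The greedy rule is why the selected samples above mix extreme corners of the similarity space rather than clustering near each other.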

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches
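The SVM split step trains on the oracle-labelled sample and classifies the rest of the cluster into a predicted-match child and a predicted-non-match child. A sketch with scikit-learn's `SVC` — the toy 2-D vectors and the linear kernel are assumptions, as the log does not state the kernel used:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical oracle-labelled training vectors (2-D for brevity;
# the real vectors carry one similarity weight per compared attribute).
X_train = np.array([[0.9, 0.9], [0.8, 1.0], [0.2, 0.1], [0.1, 0.3]])
y_train = np.array([1, 1, 0, 0])        # 1 = match, 0 = non-match

clf = SVC(kernel="linear")              # kernel choice is an assumption
clf.fit(X_train, y_train)

# Classify the remaining, unlabelled weight vectors of the cluster;
# the predicted matches and non-matches form the two child clusters.
X_rest = np.array([[0.85, 0.95], [0.15, 0.2]])
print(clf.predict(X_rest))              # [1 0]
```

When the sampled matches and non-matches are heavily imbalanced, as in the run above (23 vs. 65), the learned boundary can assign every remaining vector to one side, producing the "0 matches and 956 non-matches" split seen here.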

39.0
Analyzing file: diverg(15)995_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 995), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)995_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 861
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 861 weight vectors
  Containing 227 true matches and 634 true non-matches
    (26.36% true matches)
  Identified 804 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   767  (95.40%)
          2 :    34  (4.23%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 804 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 613

Removed 1 non-pure weight vector

Final number of weight vectors to use: 860
  Number of unique weight vectors: 804

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (804, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 804 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 804 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 718 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 565 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (565, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 565 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 565 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)243_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (15, 1 - acm diverg, 243), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)243_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 689
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 689 weight vectors
  Containing 167 true matches and 522 true non-matches
    (24.24% true matches)
  Identified 670 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   657  (98.06%)
          2 :    10  (1.49%)
          3 :     2  (0.30%)
          6 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 670 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 150
     0.000 : 520

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 689
  Number of unique weight vectors: 670

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (670, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 670 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 670 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 586 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 113 matches and 473 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (113, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (473, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 473 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 473 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.870, 0.619, 0.643, 0.700, 0.524] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
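
Farthest-first selection, as listed above, repeatedly picks the weight vector whose minimum distance to the already-selected vectors is largest, so the sample spreads over the whole cluster. A minimal sketch, assuming Euclidean distance and the first vector as the starting point (both are assumptions, not confirmed by the log):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors, each time taking the one farthest
    (in minimum Euclidean distance) from the current selection."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # assumed starting point
    # Minimum distance from each candidate to the selected set.
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], dist(v, vectors[i]))
    return selected

sample = farthest_first([[0, 0], [1, 0], [0, 1], [5, 5], [9, 9]], 3)
print(sample)  # → [[0, 0], [9, 9], [5, 5]]
```

Already-selected vectors keep a minimum distance of zero, so they are never picked twice.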

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 5 matches and 68 non-matches
    Purity of oracle classification:  0.932
    Entropy of oracle classification: 0.360
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing file: diverg(20)519_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 519), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)519_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 845
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 845 weight vectors
  Containing 227 true matches and 618 true non-matches
    (26.86% true matches)
  Identified 788 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   751  (95.30%)
          2 :    34  (4.31%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 788 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 597

Removed 1 non-pure weight vector

Final number of weight vectors to use: 844
  Number of unique weight vectors: 788
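
The uniqueness and pureness analysis above can be reproduced with a `Counter` over the (hashable) weight-vector tuples: pureness is the fraction of matches among the copies of one unique vector, and for non-pure vectors the minority-class copies are removed. A sketch on toy data (variable names are illustrative):

```python
from collections import Counter

# (weight_vector, true_match_status) pairs; tuples so they are hashable.
vectors = [
    ((1.0, 0.9), True), ((1.0, 0.9), True), ((1.0, 0.9), False),  # non-pure
    ((0.1, 0.2), False), ((0.1, 0.2), False),
    ((0.8, 0.7), True),
]

occ = Counter(v for v, _ in vectors)
freq_dist = Counter(occ.values())     # occurrence -> number of vectors
print(sorted(freq_dist.items()))      # → [(1, 1), (2, 1), (3, 1)]

# Pureness = fraction of matches among copies of the same vector.
matches = Counter(v for v, m in vectors if m)
for vec, count in occ.items():
    pureness = matches[vec] / count
    if 0.0 < pureness < 1.0:          # non-pure: drop the minority class
        minority_is_match = pureness < 0.5
        vectors = [(v, m) for v, m in vectors
                   if not (v == vec and m == minority_is_match)]
print(len(vectors))  # → 5 (one minority-class copy removed)
```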

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (788, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 788 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 788 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 703 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 162 matches and 541 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (162, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (541, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 162 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 162 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)823_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 823), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)823_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 141 matches and 543 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 543 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 543 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 12 matches and 61 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.645
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)852_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 852), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)852_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec
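
The occurrence and pureness statistics above can be computed with two passes over the weight vectors. A minimal sketch on hypothetical data (the feature tuples and labels below are made up for illustration only):

```python
from collections import Counter

# Hypothetical weight vectors: feature tuples paired with true match labels.
vectors = [((1.0, 0.5), True), ((1.0, 0.5), True), ((1.0, 0.5), False),
           ((0.2, 0.1), False), ((0.3, 0.3), False)]

# Frequency distribution: how often each unique weight vector occurs.
occ = Counter(v for v, _ in vectors)
freq_dist = Counter(occ.values())        # occurrence count -> number of vectors
print(sorted(freq_dist.items()))         # [(1, 2), (3, 1)]

# Pureness of a unique vector: fraction of its occurrences that are true matches.
match_count = Counter(v for v, is_match in vectors if is_match)
for v, n in occ.items():
    pureness = match_count[v] / n
    if 0.0 < pureness < 1.0:             # both classes present: non-pure
        print(v, round(pureness, 3))     # (1.0, 0.5) 0.667
```

A vector that occurs with both labels is non-pure; the minority-class copies are the ones removed in the log above.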

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
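
The "farthest first" selections above can be produced by the classic farthest-first traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal sketch (the Euclidean metric and using the first vector as seed are assumptions; the program may differ):

```python
import math

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal over a list of numeric tuples."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[seed]]
    # min_d[j] = distance from vectors[j] to its nearest selected vector
    min_d = [dist(v, vectors[seed]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], dist(v, vectors[i]))
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```

Keeping the running minimum distances makes each of the k rounds a single pass, O(n·k) distance evaluations overall.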

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches
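
Once the oracle has labelled the sampled vectors, a cluster that is still impure or too large is split by training a classifier on those labels and partitioning the remaining vectors by its predictions. A minimal sketch using scikit-learn's `SVC` on made-up data (the program's actual SVM parameters and feature values are unknown; the arrays below are synthetic stand-ins):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the oracle-labelled sample and the rest of the cluster.
sample = np.vstack([rng.uniform(0.6, 1.0, (26, 7)),   # "match"-like vectors
                    rng.uniform(0.0, 0.4, (61, 7))])  # "non-match"-like vectors
labels = np.array([1] * 26 + [0] * 61)
remaining = rng.uniform(0.0, 1.0, (911, 7))

clf = SVC(kernel='linear')  # kernel choice is an assumption
clf.fit(sample, labels)
pred = clf.predict(remaining)

# Split the cluster into two child clusters, as in the log above.
match_cluster = remaining[pred == 1]
nonmatch_cluster = remaining[pred == 0]
print(len(match_cluster), len(nonmatch_cluster))
```

Both child clusters are then pushed back onto the queue with the parent's purity, entropy, and estimated match proportion, as the Loop 2 listing shows.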

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 793 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(15)437_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 437), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)437_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 700
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 700 weight vectors
  Containing 214 true matches and 486 true non-matches
    (30.57% true matches)
  Identified 665 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   650  (97.74%)
          2 :    12  (1.80%)
          3 :     2  (0.30%)
         20 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 665 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 485

Removed 1 non-pure weight vector

Final number of weight vectors to use: 699
  Number of unique weight vectors: 665

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (665, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 665 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 665 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 581 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 314 matches and 267 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (314, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (267, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 314 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 314 vectors
  The selected farthest weight vectors are:
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.890, 1.000, 0.281, 0.136, 0.183, 0.250, 0.163] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 42 matches and 28 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing file: diverg(20)14_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 14), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)14_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
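
The purity and entropy figures the oracle reports above follow the usual definitions: purity is the majority-class fraction of the classified sample, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch that reproduces the reported values (the function names are illustrative, not from the original script):

```python
import math

def cluster_purity(num_match, num_non_match):
    # Purity: fraction of the sample belonging to the majority class.
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def cluster_entropy(num_match, num_non_match):
    # Binary Shannon entropy (in bits) of the match proportion.
    total = num_match + num_non_match
    p = num_match / total
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

# Reproduces the values reported for 23 matches and 65 non-matches:
print(round(cluster_purity(23, 65), 3))   # 0.739
print(round(cluster_entropy(23, 65), 3))  # 0.829
```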

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 950 non-matches

46.0
Analysing the file: diverg(10)303_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 303), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)303_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 691
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 691 weight vectors
  Containing 191 true matches and 500 true non-matches
    (27.64% true matches)
  Identified 667 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   650  (97.45%)
          2 :    14  (2.10%)
          3 :     2  (0.30%)
          7 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 667 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 169
     0.000 : 498

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 691
  Number of unique weight vectors: 667

Time to load and analyse the weight vector file: 0.01 sec
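
The "unique weight vectors" and pureness statistics above amount to counting duplicate vectors and, for each unique vector, the fraction of its occurrences that are true matches. A small sketch under that assumption (names are illustrative):

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(weight_vectors, true_match_flags):
    # Count how often each unique weight vector occurs.
    occurrences = Counter(map(tuple, weight_vectors))
    # Count how many of those occurrences are true matches.
    match_counts = defaultdict(int)
    for vec, is_match in zip(map(tuple, weight_vectors), true_match_flags):
        match_counts[vec] += int(is_match)
    # Frequency distribution: occurrence count -> number of unique vectors.
    freq_dist = Counter(occurrences.values())
    # Pureness of each unique vector: fraction of occurrences that match.
    pureness = {vec: match_counts[vec] / cnt
                for vec, cnt in occurrences.items()}
    return freq_dist, pureness
```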

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (667, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 667 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 667 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
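
The farthest-first selections shown in this log can be sketched as a greedy max-min traversal: after an arbitrary starting vector, each step picks the vector whose minimum distance to the already-selected set is largest. A sketch assuming Euclidean distance (the original script's distance metric and tie-breaking may differ):

```python
import numpy as np

def farthest_first(vectors, k, seed=None):
    # Greedy farthest-first traversal (max-min selection).
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    n = len(vectors)
    first = int(rng.integers(n))              # arbitrary starting vector
    selected = [first]
    # Distance of every vector to its nearest selected vector so far.
    min_dist = np.linalg.norm(vectors - vectors[first], axis=1)
    while len(selected) < min(k, n):
        nxt = int(np.argmax(min_dist))        # farthest from selected set
        selected.append(nxt)
        min_dist = np.minimum(
            min_dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```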

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 34 matches and 50 non-matches
    Purity of oracle classification:  0.595
    Entropy of oracle classification: 0.974
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 583 weight vectors
  Based on 34 matches and 50 non-matches
  Classified 274 matches and 309 non-matches
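
The SVM step above trains on the oracle-labelled sample and splits the remaining weight vectors of the cluster into a predicted-match and a predicted-non-match cluster. A hedged sketch using scikit-learn's SVC — the kernel and parameters of the original script are not shown in this log, so a linear kernel is assumed:

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, remaining_vectors):
    # Train on oracle-classified vectors (label 1 = match, 0 = non-match),
    # then split the rest of the cluster by predicted class.
    clf = SVC(kernel="linear")    # kernel choice is an assumption
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(remaining_vectors)
    match_cluster = [v for v, p in zip(remaining_vectors, predictions)
                     if p == 1]
    non_match_cluster = [v for v, p in zip(remaining_vectors, predictions)
                         if p == 0]
    return match_cluster, non_match_cluster
```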

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (274, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)
    (309, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)

Current size of match and non-match training data sets: 34 / 50

Selected cluster (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 309 weight vectors
- Estimated match proportion 0.405

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 309 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.565, 0.667, 0.600, 0.412, 0.381] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.455, 0.714, 0.429, 0.550, 0.647] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [1.000, 0.000, 0.864, 0.667, 0.435, 0.700, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.680, 0.000, 0.609, 0.737, 0.600, 0.529, 0.696] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.750, 0.905, 0.667, 0.500, 0.571] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.846, 0.737, 0.706, 0.583, 0.800] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analysing the file: diverg(15)475_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 475), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)475_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 943
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 943 weight vectors
  Containing 199 true matches and 744 true non-matches
    (21.10% true matches)
  Identified 898 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   864  (96.21%)
          2 :    31  (3.45%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 898 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 174
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 723

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 942
  Number of unique weight vectors: 898

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (898, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 898 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 898 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 25 matches and 61 non-matches
    Purity of oracle classification:  0.709
    Entropy of oracle classification: 0.870
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 812 weight vectors
  Based on 25 matches and 61 non-matches
  Classified 123 matches and 689 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)
    (689, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)

Current size of match and non-match training data sets: 25 / 61

Selected cluster (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 123 weight vectors
- Estimated match proportion 0.291

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(10)977_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 977), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)977_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 375
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 375 weight vectors
  Containing 195 true matches and 180 true non-matches
    (52.00% true matches)
  Identified 348 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   332  (95.40%)
          2 :    13  (3.74%)
          3 :     2  (0.57%)
         11 :     1  (0.29%)

Identified 1 non-pure unique weight vector (from 348 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 170
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 177

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 374
  Number of unique weight vectors: 348

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (348, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 348 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 348 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
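
The "farthest first" selection used above can be sketched as a greedy loop that repeatedly picks the vector whose minimum Euclidean distance to the already selected set is largest (the seed vector and the distance metric here are assumptions; the script may choose them differently):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, always adding the one farthest
    (by minimum Euclidean distance) from those already selected."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed with the first vector (an assumption)
    while len(selected) < k:
        remaining = [v for v in vectors if v not in selected]
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

# Tiny example with 2-dimensional weight vectors
picked = farthest_first([(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)], 3)
```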

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 31 matches and 44 non-matches
    Purity of oracle classification:  0.587
    Entropy of oracle classification: 0.978
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0
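
The purity and entropy figures reported for each oracle classification follow the usual definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy (in bits) of the class distribution. A minimal sketch, checked against the 31-match / 44-non-match result above:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Return (purity, entropy) of a two-class distribution:
    purity = fraction of the majority class,
    entropy = binary Shannon entropy in bits."""
    total = num_match + num_non_match
    p = num_match / float(total)
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log(q, 2) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = purity_entropy(31, 44)  # ~0.587 and ~0.978, as logged above
```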

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 273 weight vectors
  Based on 31 matches and 44 non-matches
  Classified 144 matches and 129 non-matches
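
The SVM split step can be sketched with scikit-learn (an assumption: the original script may use a different SVM implementation, and its kernel and parameters are unknown). The oracle-labelled vectors serve as training data, and the remaining unlabelled vectors are split into two clusters by the predicted class:

```python
from sklearn import svm

def svm_split(train_match, train_non_match, unlabelled):
    """Train an SVM on the oracle-labelled weight vectors, then split the
    remaining unlabelled vectors into predicted-match and
    predicted-non-match clusters."""
    X = train_match + train_non_match
    y = [1] * len(train_match) + [0] * len(train_non_match)
    clf = svm.SVC(kernel='linear')  # kernel choice is an assumption
    clf.fit(X, y)
    preds = clf.predict(unlabelled)
    match_cluster = [v for v, p in zip(unlabelled, preds) if p == 1]
    non_match_cluster = [v for v, p in zip(unlabelled, preds) if p == 0]
    return match_cluster, non_match_cluster

# Tiny linearly separable example
m, n = svm_split([[0.9, 0.9], [0.8, 1.0]], [[0.1, 0.1], [0.2, 0.0]],
                 [[0.95, 0.95], [0.05, 0.05]])
```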

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.5866666666666667, 0.9782176659354248, 0.41333333333333333)
    (129, 0.5866666666666667, 0.9782176659354248, 0.41333333333333333)

Current size of match and non-match training data sets: 31 / 44

Selected cluster with (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 144 weight vectors
- Estimated match proportion 0.413

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 144 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 50 matches and 7 non-matches
    Purity of oracle classification:  0.877
    Entropy of oracle classification: 0.537
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)134_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 134), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)134_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1001
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1001 weight vectors
  Containing 198 true matches and 803 true non-matches
    (19.78% true matches)
  Identified 959 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   924  (96.35%)
          2 :    32  (3.34%)
          3 :     2  (0.21%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 959 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 783

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1001
  Number of unique weight vectors: 959

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (959, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 959 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 959 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 872 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 106 matches and 766 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (106, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (766, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 106 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 46

Farthest first selection of 46 weight vectors from 106 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [0.511, 1.000, 1.000, 1.000, 1.000, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 46 weight vectors
  The oracle will correctly classify 46 weight vectors and wrongly classify 0
  Classified 46 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 46 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(10)372_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (10, 1 - acm diverg, 372), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)372_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 945
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 945 weight vectors
  Containing 153 true matches and 792 true non-matches
    (16.19% true matches)
  Identified 908 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   877  (96.59%)
          2 :    28  (3.08%)
          3 :     2  (0.22%)
          6 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 908 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 136
     0.000 : 772

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 945
  Number of unique weight vectors: 908

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (908, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 908 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 908 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 821 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 248 matches and 573 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (248, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (573, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 248 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 248 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.663, 1.000, 0.132, 0.143, 0.241, 0.174, 0.167] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
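
The "farthest first" lists above come from a greedy farthest-first traversal (Gonzalez's k-center heuristic): start from one vector, then repeatedly pick the vector whose distance to the already-selected set is largest, so the sample spreads across the weight-vector space and catches both matches and non-matches. A minimal sketch, assuming Euclidean distance and the first vector as the start point (both assumptions):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: select k maximally spread-out
    vectors. Assumes distinct vectors and Euclidean distance."""
    if not vectors:
        return []
    selected = [vectors[0]]  # assumption: start from the first vector
    # distance of every vector to its nearest selected vector so far
    dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        idx = max(range(len(vectors)), key=dist.__getitem__)
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            dist[i] = min(dist[i], math.dist(v, vectors[idx]))
    return selected
```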

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 42 matches and 22 non-matches
    Purity of oracle classification:  0.656
    Entropy of oracle classification: 0.928
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  22
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(10)887_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 887), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)887_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 583
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 583 weight vectors
  Containing 208 true matches and 375 true non-matches
    (35.68% true matches)
  Identified 550 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   533  (96.91%)
          2 :    14  (2.55%)
          3 :     2  (0.36%)
         16 :     1  (0.18%)
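
The occurrence distribution above can be computed with two nested counts: first how often each distinct weight vector occurs, then how many vectors share each occurrence count. A minimal sketch:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that occur exactly that often. Vectors must be hashable
    (e.g. tuples of weights)."""
    per_vector = Counter(weight_vectors)   # vector -> occurrence count
    return Counter(per_vector.values())    # occurrence count -> #vectors
```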

Identified 1 non-pure unique weight vector (from 550 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 177
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 372

Removed 1 non-pure weight vector

Final number of weight vectors to use: 582
  Number of unique weight vectors: 550

Time to load and analyse the weight vector file: 0.01 sec
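
The pureness filtering above groups identical weight vectors and checks whether all copies share the same true match status. From the log it appears that above some pureness threshold only the minority-class copies are dropped, while below it every copy is dropped (0.938 → 1 of 16 removed, 0.875 → all 8 removed). The threshold value 0.9 below is inferred from the log, not taken from the script:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, purity_threshold=0.9):
    """weight_vectors: list of (vector_tuple, is_match) pairs.
    For vectors occurring with both labels: if the majority-class
    fraction reaches purity_threshold, drop only minority-class copies;
    otherwise drop every copy. The threshold is an assumption."""
    counts = defaultdict(lambda: [0, 0])  # vec -> [non-match, match] copies
    for vec, is_match in weight_vectors:
        counts[vec][int(is_match)] += 1
    kept = []
    for vec, is_match in weight_vectors:
        non_m, m = counts[vec]
        if non_m == 0 or m == 0:                 # pure vector: keep
            kept.append((vec, is_match))
            continue
        pureness = max(non_m, m) / (non_m + m)
        majority_is_match = m > non_m
        if pureness >= purity_threshold and is_match == majority_is_match:
            kept.append((vec, is_match))         # keep majority copies only
    return kept
```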

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (550, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 550 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 550 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 29 matches and 53 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.937
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0
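
The oracle is simulated with a configurable accuracy (the `oracle_acc` parameter, printed here as a percentage); with 100% accuracy it simply reproduces the true match status. A sketch that takes the accuracy as a fraction in [0, 1] (the translation from the script's percentage is an assumption):

```python
import random

def oracle_classify(weight_vectors, accuracy, rng=None):
    """weight_vectors: list of (vector, true_is_match) pairs.
    Split them into (matches, non_matches) as labelled by an oracle that
    answers correctly with probability `accuracy`; 1.0 reproduces the
    true status exactly."""
    rng = rng or random.Random(42)  # arbitrary seed for the sketch
    matches, non_matches = [], []
    for vec, true_is_match in weight_vectors:
        answer = true_is_match if rng.random() < accuracy else not true_is_match
        (matches if answer else non_matches).append(vec)
    return matches, non_matches
```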

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 468 weight vectors
  Based on 29 matches and 53 non-matches
  Classified 151 matches and 317 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)
    (317, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)

Current size of match and non-match training data sets: 29 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 151 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 51 matches and 5 non-matches
    Purity of oracle classification:  0.911
    Entropy of oracle classification: 0.434
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(20)312_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (20, 1 - acm diverg, 312), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)312_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1061
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1061 weight vectors
  Containing 188 true matches and 873 true non-matches
    (17.72% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   988  (96.96%)
          2 :    28  (2.75%)
          3 :     2  (0.20%)
         11 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 166
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1060
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 75 matches and 857 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (75, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (857, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 75 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 38

Farthest first selection of 38 weight vectors from 75 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)

Perform oracle with 100.00% accuracy on 38 weight vectors
  The oracle will correctly classify 38 weight vectors and wrongly classify 0
  Classified 38 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 38 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(10)24_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 24), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)24_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 866
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 866 weight vectors
  Containing 154 true matches and 712 true non-matches
    (17.78% true matches)
  Identified 830 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   802  (96.63%)
          2 :    25  (3.01%)
          3 :     2  (0.24%)
          8 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 830 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 138
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 691

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 858
  Number of unique weight vectors: 829

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (829, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 829 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 829 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
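
The farthest-first traversal used for this selection greedily starts from a seed vector and repeatedly adds the vector whose nearest already-selected neighbour is farthest away, so the sample spreads across the weight-vector space. A minimal Euclidean sketch (the real program's distance metric and seeding are assumptions):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly add the vector maximising the distance to the selected set."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    # Minimum distance from each candidate to the selected set so far
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], dist(v, vectors[i]))
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```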

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0
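
The purity and entropy figures reported for each classification follow the usual two-class definitions: purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion. For the 24 matches and 62 non-matches above:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Two-class purity (majority-class fraction) and binary Shannon entropy."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                  # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(24, 62)
print(round(purity, 3), round(entropy, 3))  # 0.721 0.854
```

These match the (size, purity, entropy, match proportion) tuples printed for the queue in the next loop.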

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 743 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 60 matches and 683 non-matches
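
After the oracle labels a sample, the remaining vectors of the cluster are split into two child clusters by a classifier trained on the labelled sample (an SVM in this run). The splitting step can be sketched dependency-free with a nearest-centroid stand-in for the SVM; the stand-in classifier is an illustration, not the program's actual split classifier:

```python
def split_cluster(train_matches, train_non_matches, remaining):
    """Split `remaining` vectors into predicted matches / non-matches by
    nearest class centroid (a stand-in for the SVM used in the run above)."""
    def centroid(vectors):
        n = len(vectors)
        return [sum(col) / n for col in zip(*vectors)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cm, cn = centroid(train_matches), centroid(train_non_matches)
    matches, non_matches = [], []
    for v in remaining:
        (matches if sq_dist(v, cm) < sq_dist(v, cn) else non_matches).append(v)
    return matches, non_matches

m, n = split_cluster([[1.0, 1.0]], [[0.0, 0.0]], [[0.9, 0.8], [0.1, 0.2]])
print(len(m), len(n))  # 1 1
```

Both child clusters are then pushed back onto the queue, as the Loop 2 output below shows.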

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (60, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (683, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 60 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 34

Farthest first selection of 34 weight vectors from 60 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 1.000, 0.952, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.420, 1.000, 1.000, 1.000, 1.000, 1.000, 0.947] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)

Perform oracle with 100.00% accuracy on 34 weight vectors
  The oracle will correctly classify 34 weight vectors and wrongly classify 0
  Classified 33 matches and 1 non-match
    Purity of oracle classification:  0.971
    Entropy of oracle classification: 0.191
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 34 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing the file: diverg(10)690_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 690), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)690_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 803
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 803 weight vectors
  Containing 208 true matches and 595 true non-matches
    (25.90% true matches)
  Identified 756 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   721  (95.37%)
          2 :    32  (4.23%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 756 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 574

Removed 1 non-pure weight vector

Final number of weight vectors to use: 802
  Number of unique weight vectors: 756

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (756, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 756 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 756 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 26 matches and 59 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.888
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 671 weight vectors
  Based on 26 matches and 59 non-matches
  Classified 138 matches and 533 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)
    (533, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)

Current size of match and non-match training data sets: 26 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 533 weight vectors
- Estimated match proportion 0.306

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 533 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 13 matches and 58 non-matches
    Purity of oracle classification:  0.817
    Entropy of oracle classification: 0.687
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)438_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (10, 1 - acm diverg, 438), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)438_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 621
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 621 weight vectors
  Containing 160 true matches and 461 true non-matches
    (25.76% true matches)
  Identified 587 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   557  (94.89%)
          2 :    27  (4.60%)
          3 :     2  (0.34%)
          4 :     1  (0.17%)

Identified 0 non-pure unique weight vectors (from 587 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 146
     0.000 : 441

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 621
  Number of unique weight vectors: 587

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (587, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 587 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 587 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 25 matches and 57 non-matches
    Purity of oracle classification:  0.695
    Entropy of oracle classification: 0.887
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 505 weight vectors
  Based on 25 matches and 57 non-matches
  Classified 97 matches and 408 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (97, 0.6951219512195121, 0.8871723027673717, 0.3048780487804878)
    (408, 0.6951219512195121, 0.8871723027673717, 0.3048780487804878)

Current size of match and non-match training data sets: 25 / 57

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.89
- Size 408 weight vectors
- Estimated match proportion 0.305

Sample size for this cluster: 68
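
Each loop iteration traced here follows the same pattern: pop a cluster from the queue, draw a sample, query the oracle, and split what remains if the cluster is still impure or too large. A simplified, self-contained skeleton (random sampling and a naive halving split stand in for the farthest-first and SVM steps; all names are illustrative):

```python
import random

def recursive_selection(labels, budget, min_purity, max_cluster_size,
                        sample_size=10, seed=0):
    """Simplified skeleton of the recursive training example selection."""
    rng = random.Random(seed)
    queue = [list(range(len(labels)))]       # clusters as index lists
    train_m, train_n = [], []                # match / non-match training sets
    used = 0                                 # oracle classifications so far
    while queue and used < budget:
        cluster = queue.pop(rng.randrange(len(queue)))  # random queue ordering
        sample = rng.sample(cluster, min(sample_size, len(cluster)))
        used += len(sample)
        matches = [i for i in sample if labels[i]]      # perfect oracle
        non_matches = [i for i in sample if not labels[i]]
        train_m += matches
        train_n += non_matches
        rest = [i for i in cluster if i not in set(sample)]
        purity = max(len(matches), len(non_matches)) / max(len(sample), 1)
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            half = len(rest) // 2            # stand-in for the SVM split
            queue += [rest[:half], rest[half:]]
    return train_m, train_n
```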

Farthest first selection of 68 weight vectors from 408 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
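
The farthest-first selection listed above can be sketched as a greedy traversal: start from one vector, then repeatedly pick the vector whose distance to its nearest already-selected vector is largest. A minimal sketch with Euclidean distance (the program's actual metric and starting rule may differ):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedily select k vectors, each maximising the distance to the
    nearest previously selected vector."""
    vectors = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(vectors)))]
    # distance of every vector to its nearest selected vector so far
    min_dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected

# 68 of 408 seven-dimensional weight vectors, as in the round above
vecs = np.random.default_rng(1).random((408, 7))
picked = farthest_first(vecs, 68)
print(len(picked), len(set(picked)))   # 68 distinct indices
```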

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(20)752_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 752), dtype: object
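
The precision, recall and f-measure in these per-file summaries follow from the tp/fp/fn counts in the same Series; a quick check (function name `prf` is illustrative):

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2.0 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Reproduce the figures for diverg(20)752_NEW.csv: tp=45, fp=1, fn=254
p, r, f = prf(45, 1, 254)
print(round(p, 6), round(r, 6), round(f, 5))   # 0.978261 0.150502 0.26087
```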

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)752_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1086
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1086 weight vectors
  Containing 220 true matches and 866 true non-matches
    (20.26% true matches)
  Identified 1030 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   994  (96.50%)
          2 :    33  (3.20%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1030 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 845

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1085
  Number of unique weight vectors: 1030
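
The duplicate counts and the non-pure check above amount to counting identical weight vectors and flagging any vector that occurs with both labels; a toy sketch (tiny hypothetical two-weight vectors, not the real data):

```python
from collections import Counter

# Toy weight vectors with labels; (0.5, 0.5) occurs with both labels
wv_list = [
    ((1.0, 0.9), True), ((1.0, 0.9), True),   # duplicated pure vector
    ((0.2, 0.1), False),
    ((0.5, 0.5), True), ((0.5, 0.5), False),  # non-pure vector
]

freq = Counter(wv for wv, _ in wv_list)            # occurrence counts
match_count = Counter()
for wv, is_match in wv_list:
    match_count[wv] += int(is_match)

# Frequency distribution: occurrence -> number of vectors occurring that often
print(sorted(Counter(freq.values()).items()))      # [(1, 1), (2, 2)]

# Non-pure vectors carry both match and non-match labels
non_pure = [wv for wv in freq if 0 < match_count[wv] < freq[wv]]
print(non_pure)                                    # [(0.5, 0.5)]
```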

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1030, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1030 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1030 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 942 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 125 matches and 817 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (125, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 125 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 125 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(10)480_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (10, 1 - acm diverg, 480), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)480_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 421
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 421 weight vectors
  Containing 184 true matches and 237 true non-matches
    (43.71% true matches)
  Identified 400 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   390  (97.50%)
          2 :     7  (1.75%)
          3 :     2  (0.50%)
         11 :     1  (0.25%)

Identified 1 non-pure unique weight vector (from 400 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 163
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 236

Removed 1 non-pure weight vector

Final number of weight vectors to use: 420
  Number of unique weight vectors: 400

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (400, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 400 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 400 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 34 matches and 43 non-matches
    Purity of oracle classification:  0.558
    Entropy of oracle classification: 0.990
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 323 weight vectors
  Based on 34 matches and 43 non-matches
  Classified 123 matches and 200 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.5584415584415584, 0.9901226308935799, 0.44155844155844154)
    (200, 0.5584415584415584, 0.9901226308935799, 0.44155844155844154)

Current size of match and non-match training data sets: 34 / 43

Selected cluster with (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 200 weight vectors
- Estimated match proportion 0.442

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 200 vectors
  The selected farthest weight vectors are:
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 9 matches and 55 non-matches
    Purity of oracle classification:  0.859
    Entropy of oracle classification: 0.586
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(20)37_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 37), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)37_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec
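
The non-pure-vector step above groups identical weight vectors and, whenever a group mixes matches and non-matches, removes the minority-class copies so every unique vector is pure. A minimal sketch (the function name and tie-breaking rule are assumptions; the program's exact handling is not shown):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    """Group identical weight vectors; for any group containing both
    matches and non-matches, drop the minority-class copies so every
    unique vector is pure. Returns the indices to keep."""
    groups = defaultdict(list)
    for i, wv in enumerate(weight_vectors):
        groups[tuple(wv)].append(i)
    keep = []
    for idxs in groups.values():
        n_match = sum(labels[i] for i in idxs)
        # Majority class of this group; ties counted as matches (assumption)
        majority = n_match * 2 >= len(idxs)
        keep.extend(i for i in idxs if labels[i] == majority)
    return sorted(keep)
```

In the run above, one unique vector occurred 12 times with pureness 0.917 (11 matches, 1 non-match), so its single minority copy was removed.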

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
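
The "farthest first" selection above is a greedy max-min traversal: starting from a seed vector, it repeatedly adds the vector whose minimum distance to the already selected set is largest. A minimal sketch (the program's seeding strategy and distance metric are assumptions; Euclidean distance and a fixed start index are used here):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first selection of k vectors from a list,
    seeded with the vector at index `start`."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    # Minimum distance from each vector to the selected set so far
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        next_idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(next_idx)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[next_idx]))
    return selected
```

Each added vector only shrinks the stored minimum distances, so the whole selection runs in O(k·n) distance computations.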

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches
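
The program trains an SVM on the oracle-labelled vectors and uses it to partition the rest of the cluster into predicted matches and non-matches. As a self-contained stand-in for that step (a nearest-centroid split, not the SVM the program actually uses; function names are illustrative):

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def split_by_training(unclassified, match_train, nonmatch_train):
    """Partition the remaining weight vectors into predicted matches and
    non-matches by distance to the two class centroids (a stand-in for
    the SVM classifier used in the program)."""
    cm = centroid(match_train)
    cn = centroid(nonmatch_train)

    def d2(a, b):  # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches, nonmatches = [], []
    for v in unclassified:
        (matches if d2(v, cm) <= d2(v, cn) else nonmatches).append(v)
    return matches, nonmatches
```

The two resulting sub-clusters are what the program then pushes back onto the queue for further splitting.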

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 101 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(15)687_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 687), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)687_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1068
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1068 weight vectors
  Containing 226 true matches and 842 true non-matches
    (21.16% true matches)
  Identified 1011 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   974  (96.34%)
          2 :    34  (3.36%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1011 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 821

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1067
  Number of unique weight vectors: 1011

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1011, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1011 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1011 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 924 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 131 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (793, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 131 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)990_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 990), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)990_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 799
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 799 weight vectors
  Containing 224 true matches and 575 true non-matches
    (28.04% true matches)
  Identified 760 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   741  (97.50%)
          2 :    16  (2.11%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 760 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 572

Removed 1 non-pure weight vector

Final number of weight vectors to use: 798
  Number of unique weight vectors: 760

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (760, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 760 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 675 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 149 matches and 526 non-matches
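
The SVM split step logged above (train on the oracle-labelled sample, then classify the remaining cluster vectors to split the cluster in two) can be sketched as below. The use of scikit-learn and the linear kernel are assumptions for illustration, not necessarily what this program uses.

```python
# Sketch of the SVM split step, assuming scikit-learn.
# Kernel and parameters are assumptions; the program may differ.
import numpy as np
from sklearn import svm

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled weight vectors and split
    the remaining cluster into predicted matches / non-matches."""
    clf = svm.SVC(kernel="linear")
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches
```

The two returned lists correspond to the two new clusters pushed onto the queue (here: 149 predicted matches and 526 predicted non-matches).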

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (526, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 526 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 526 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
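
The farthest-first selections logged throughout this run follow the standard greedy traversal: repeatedly pick the vector whose distance to its nearest already-selected vector is largest. A minimal sketch, assuming Euclidean distance and an arbitrary first seed (the program's actual seed rule is not shown in the log):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over weight vectors."""
    def dist(a, b):
        # Euclidean distance; an assumption about the metric used.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # arbitrary seed (assumption)
    while len(selected) < min(k, len(vectors)):
        best, best_d = None, -1.0
        for v in vectors:
            d = min(dist(v, s) for s in selected)
            if d > best_d:
                best, best_d = v, d
        selected.append(best)
    return selected
```

This greedy scheme tends to pick extreme, mutually distant vectors, which is why the samples above mix clear matches and clear non-matches.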

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 7 matches and 67 non-matches
    Purity of oracle classification:  0.905
    Entropy of oracle classification: 0.452
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0
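
The oracle statistics printed above are reproducible: purity is the majority-class fraction of the classified sample and entropy is the binary entropy of the match proportion (0.905 and 0.452 for 7 matches / 67 non-matches). The sketch below also simulates an imperfect oracle by flipping each true label with probability 1 - accuracy, which is an assumption about how an `oracle_acc` below 100% would be applied.

```python
import math
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Simulate a human oracle: keep each true 0/1 label with the
    given probability, flip it otherwise (hypothetical model)."""
    rng = rng or random.Random()
    return [lab if rng.random() < accuracy else 1 - lab
            for lab in true_labels]

def purity_and_entropy(num_match, num_non_match):
    """Purity = majority-class fraction; entropy = binary entropy
    of the match proportion, as reported in the log."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1 - p)
    entropy = 0.0
    for q in (p, 1 - p):
        if q > 0:
            entropy -= q * math.log2(q)
    return purity, entropy
```

For example, `purity_and_entropy(7, 67)` gives (0.905, 0.452) and `purity_and_entropy(29, 56)` gives (0.659, 0.926), matching the figures above.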

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing the file: diverg(15)965_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 965), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)965_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 213 true matches and 595 true non-matches
    (26.36% true matches)
  Identified 754 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   719  (95.36%)
          2 :    32  (4.24%)
          3 :     2  (0.27%)
         19 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 754 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 574

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 754
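
The load-and-analyse step above can be sketched as follows: count how often each unique weight vector occurs, compute its pureness (fraction of matches among its occurrences), and remove the minority-class copies of non-pure vectors. The majority tie-break at 0.5 is an assumption.

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(pairs):
    """pairs: list of (weight_vector_tuple, is_match) records.
    Returns the occurrence frequency distribution and the list of
    records kept after dropping minority-class copies of non-pure
    unique vectors."""
    occ = Counter(v for v, _ in pairs)
    matches = defaultdict(int)
    for v, is_match in pairs:
        matches[v] += int(is_match)

    freq_dist = Counter(occ.values())  # occurrence : number of vectors
    kept = []
    for v, is_match in pairs:
        pureness = matches[v] / occ[v]
        majority_is_match = pureness >= 0.5  # tie-break is an assumption
        if occ[v] == 1 or is_match == majority_is_match:
            kept.append((v, is_match))
    return freq_dist, kept
```

In the run above this drops exactly one record (the minority copy of the vector with pureness 0.947), leaving 807 of the original 808 weight vectors.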

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (754, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 754 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 754 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 669 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 142 matches and 527 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (527, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 142 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analyzing the file: diverg(15)372_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 372), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)372_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 831
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 831 weight vectors
  Containing 227 true matches and 604 true non-matches
    (27.32% true matches)
  Identified 774 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   737  (95.22%)
          2 :    34  (4.39%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 774 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 583

Removed 1 non-pure weight vector

Final number of weight vectors to use: 830
  Number of unique weight vectors: 774

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (774, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 774 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 774 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 689 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 151 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (538, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 538 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 538 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 9 matches and 64 non-matches
    Purity of oracle classification:  0.877
    Entropy of oracle classification: 0.539
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)928_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984375
recall                 0.210702
f-measure              0.347107
da                           64
dm                            0
ndm                           0
tp                           63
fp                            1
tn                  4.76529e+07
fn                          236
Name: (10, 1 - acm diverg, 928), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)928_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 696
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 696 weight vectors
  Containing 198 true matches and 498 true non-matches
    (28.45% true matches)
  Identified 664 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   648  (97.59%)
          2 :    13  (1.96%)
          3 :     2  (0.30%)
         16 :     1  (0.15%)

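The occurrence histogram above (how many distinct weight vectors appear once, twice, etc.) can be sketched with a hypothetical helper like this, assuming the weight vectors are sequences of floats; this is an illustration, not the original program's code:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First count how often each distinct weight vector occurs,
    # then tally how many distinct vectors share each occurrence count.
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())

# Example: three distinct vectors occurring 2, 1 and 3 times
vecs = [(1, 0), (1, 0), (0, 1), (2, 2), (2, 2), (2, 2)]
dist = occurrence_distribution(vecs)  # {1: 1, 2: 1, 3: 1}
```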
Identified 1 non-pure unique weight vector (from 664 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 168
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 495

Removed 1 non-pure weight vector

Final number of weight vectors to use: 695
  Number of unique weight vectors: 664

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (664, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 664 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 664 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

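The "farthest first" initial selection used above can be sketched as the classic greedy procedure: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. This is a minimal illustration assuming Euclidean distance, not the authors' exact implementation:

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector,
    # then add the vector maximising the minimum Euclidean distance
    # to everything selected so far.
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        far = max(remaining, key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(far)
        selected.append(far)
    return selected
```

The selected sample is then handed to the oracle, which explains why the listed vectors are spread across the weight space rather than clustered.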
Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 35 matches and 49 non-matches
    Purity of oracle classification:  0.583
    Entropy of oracle classification: 0.980
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

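The purity and entropy figures reported for each oracle classification follow the usual definitions: purity is the fraction of the majority class, and entropy is the base-2 Shannon entropy of the match/non-match distribution. A small sketch (hypothetical helper name) reproducing the numbers above, e.g. 35 matches and 49 non-matches giving purity 0.583 and entropy 0.980:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity: proportion of the larger class.
    # Entropy: -sum(p * log2(p)) over the two class proportions.
    total = num_matches + num_non_matches
    p, q = num_matches / total, num_non_matches / total
    purity = max(p, q)
    entropy = -sum(x * math.log2(x) for x in (p, q) if x > 0)
    return purity, entropy
```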
Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 580 weight vectors
  Based on 35 matches and 49 non-matches
  Classified 270 matches and 310 non-matches

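The split step trains a classifier on the oracle-labelled sample and partitions the remaining weight vectors of the cluster by predicted class. A rough sketch of such an SVM split, assuming scikit-learn's `SVC` (the original program's classifier configuration is not shown in this log):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    # Train an SVM on the oracle-labelled sample (1 = match, 0 = non-match),
    # then split the unlabelled cluster members by predicted class.
    clf = SVC(kernel="linear")
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches
```

Each resulting sub-cluster is pushed back onto the queue with the purity, entropy, and estimated match proportion of the sample that produced it, which is why both queue entries in Loop 2 carry identical statistics.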
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (270, 0.5833333333333334, 0.9798687566511527, 0.4166666666666667)
    (310, 0.5833333333333334, 0.9798687566511527, 0.4166666666666667)

Current size of match and non-match training data sets: 35 / 49

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 270 weight vectors
- Estimated match proportion 0.417

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 270 vectors
  The selected farthest weight vectors are:
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 45 matches and 24 non-matches
    Purity of oracle classification:  0.652
    Entropy of oracle classification: 0.932
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  24
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

64.0
Analysing the file: diverg(15)196_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 196), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)196_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 759
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 759 weight vectors
  Containing 185 true matches and 574 true non-matches
    (24.37% true matches)
  Identified 735 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   722  (98.23%)
          2 :    10  (1.36%)
          3 :     2  (0.27%)
         11 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 735 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 163
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 571

Removed 1 non-pure weight vector

Final number of weight vectors to use: 758
  Number of unique weight vectors: 735

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (735, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 735 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 735 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 650 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 126 matches and 524 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (126, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (524, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 524 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 524 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.522, 0.786, 0.800, 0.824, 0.667] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 6 matches and 69 non-matches
    Purity of oracle classification:  0.920
    Entropy of oracle classification: 0.402
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing the file: diverg(15)607_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 607), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)607_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 837
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 837 weight vectors
  Containing 220 true matches and 617 true non-matches
    (26.28% true matches)
  Identified 781 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   745  (95.39%)
          2 :    33  (4.23%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 781 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 596

Removed 1 non-pure weight vector

Final number of weight vectors to use: 836
  Number of unique weight vectors: 781

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (781, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 781 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 781 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
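
The farthest-first selection reported above can be sketched as a greedy traversal: repeatedly pick the weight vector whose minimum distance to the already-selected set is largest, so the sample spreads across the weight space. A minimal sketch in plain Python; the function name, the arbitrary starting point, and the choice of Euclidean distance are illustrative assumptions, not taken from the original program.

```python
# Farthest-first traversal: greedily pick the vector whose minimum
# distance to the already-selected set is largest (illustrative sketch).
import math

def farthest_first(vectors, k):
    """Select k vectors, each maximising its distance to the chosen set."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # arbitrary starting point (an assumption)
    while len(selected) < k:
        # for each candidate, measure its distance to the *closest*
        # already-selected vector, then take the candidate maximising it
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

sample = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.1), (1.0, 0.0)]
print(farthest_first(sample, 3))  # -> [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```

The near-duplicate (0.1, 0.1) is skipped in favour of the spread-out corners, which is why the selected vectors above cover both match-like and non-match-like regions of the weight space.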

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
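
The purity and entropy figures printed for each oracle classification follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. A sketch with illustrative function names that reproduces the logged values for 28 matches and 57 non-matches:

```python
# Purity (majority-class fraction) and binary Shannon entropy of a
# cluster, given its match / non-match counts (illustrative sketch).
import math

def purity_entropy(num_match, num_non_match):
    total = num_match + num_non_match
    p = num_match / total              # match proportion
    purity = max(p, 1.0 - p)           # fraction in the majority class
    if p in (0.0, 1.0):
        entropy = 0.0                  # a pure cluster has zero entropy
    else:
        entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return purity, entropy

purity, entropy = purity_entropy(28, 57)
print(round(purity, 3), round(entropy, 3))  # -> 0.671 0.914
```

A perfectly mixed cluster (p = 0.5) gives purity 0.5 and entropy 1.0, which is exactly the initial queue entry (781, 0.5, 1.0, 0.5) shown in Loop 1.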

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 696 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 154 matches and 542 non-matches
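
The split step can be sketched as: train a binary SVM on the oracle-labelled weight vectors, then partition the remaining unlabelled vectors of the cluster by predicted class into two sub-clusters, which are pushed back onto the queue. The sketch below assumes scikit-learn's `SVC` and uses synthetic stand-in data; the original program may use a different SVM implementation and parameters.

```python
# Split a cluster with an SVM trained on oracle-labelled vectors
# (illustrative sketch; synthetic stand-in data, scikit-learn assumed).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)
# toy stand-ins for the oracle-labelled training vectors (7 weights each)
train_match = rng.uniform(0.7, 1.0, size=(28, 7))
train_non_match = rng.uniform(0.0, 0.5, size=(57, 7))
X_train = np.vstack([train_match, train_non_match])
y_train = np.array([1] * 28 + [0] * 57)

clf = SVC(kernel="linear")
clf.fit(X_train, y_train)

# split the remaining (unlabelled) cluster by predicted class
remaining = rng.uniform(0.0, 1.0, size=(100, 7))
pred = clf.predict(remaining)
match_cluster = remaining[pred == 1]
non_match_cluster = remaining[pred == 0]
print(len(match_cluster), len(non_match_cluster))
```

Each sub-cluster initially inherits the parent's purity, entropy, and estimated match proportion, which is why both queue entries in Loop 2 show identical statistics despite their different sizes.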

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (154, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (542, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 154 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 154 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 47 matches and 8 non-matches
    Purity of oracle classification:  0.855
    Entropy of oracle classification: 0.598
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(15)974_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 974), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)974_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 781
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 781 weight vectors
  Containing 206 true matches and 575 true non-matches
    (26.38% true matches)
  Identified 752 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   735  (97.74%)
          2 :    14  (1.86%)
          3 :     2  (0.27%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 752 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 179
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 572

Removed 1 non-pure weight vector

Final number of weight vectors to use: 780
  Number of unique weight vectors: 752

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (752, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 752 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 752 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.429, 0.786, 0.750, 0.389, 0.857] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 667 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 137 matches and 530 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (137, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (530, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 137 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 137 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 51 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.232
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)535_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 535), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)535_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 563
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 563 weight vectors
  Containing 147 true matches and 416 true non-matches
    (26.11% true matches)
  Identified 546 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   537  (98.35%)
          2 :     6  (1.10%)
          3 :     2  (0.37%)
          8 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 546 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 132
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 413

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 555
  Number of unique weight vectors: 545

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (545, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 545 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 545 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.538, 0.789, 0.353, 0.545, 0.550] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 31 matches and 50 non-matches
    Purity of oracle classification:  0.617
    Entropy of oracle classification: 0.960
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 464 weight vectors
  Based on 31 matches and 50 non-matches
  Classified 107 matches and 357 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (107, 0.6172839506172839, 0.9599377175669783, 0.38271604938271603)
    (357, 0.6172839506172839, 0.9599377175669783, 0.38271604938271603)

Current size of match and non-match training data sets: 31 / 50

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 357 weight vectors
- Estimated match proportion 0.383

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 357 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [1.000, 0.000, 0.565, 0.667, 0.600, 0.412, 0.381] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.778, 0.429, 0.571, 0.750, 0.600] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.889, 0.000, 0.714, 0.700, 0.500, 0.636, 0.765] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 0.000, 0.750, 0.905, 0.667, 0.500, 0.571] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
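
The farthest-first listings above follow a greedy max-min traversal: start from a seed vector, then repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest. A sketch under that assumption (the seed choice and tie-breaking used by the original program are unknown):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over weight vectors
    selected = [vectors[0]]          # seed: first vector (an assumption)
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the remaining vector farthest from the selected set
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy example: from four 2-d vectors, (1.0, 1.0) is farthest from the seed
sample = farthest_first([(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.5, 0.5)], 2)
```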

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 1 match and 71 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.106
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analyzing the file: diverg(15)436_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 436), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)436_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 407
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 407 weight vectors
  Containing 217 true matches and 190 true non-matches
    (53.32% true matches)
  Identified 370 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   352  (95.14%)
          2 :    15  (4.05%)
          3 :     2  (0.54%)
         19 :     1  (0.27%)
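
A frequency table like the one above can be computed with `collections.Counter`: first count how often each distinct weight vector occurs, then count how many distinct vectors share each occurrence count. A minimal sketch with hypothetical toy data:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # How often each distinct weight vector occurs, then how many
    # distinct vectors share each occurrence count
    vector_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vector_counts.values())

# Toy data: one vector occurring twice, two vectors occurring once
dist = occurrence_distribution([[0.5, 1.0], [0.5, 1.0], [0.2, 0.3], [0.9, 0.9]])
```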

Identified 1 non-pure unique weight vector (from 370 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 187

Removed 1 non-pure weight vector

Final number of weight vectors to use: 406
  Number of unique weight vectors: 370

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (370, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 370 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 370 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 29 matches and 47 non-matches
    Purity of oracle classification:  0.618
    Entropy of oracle classification: 0.959
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 294 weight vectors
  Based on 29 matches and 47 non-matches
  Classified 145 matches and 149 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.618421052631579, 0.959149554396894, 0.3815789473684211)
    (149, 0.618421052631579, 0.959149554396894, 0.3815789473684211)

Current size of match and non-match training data sets: 29 / 47

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 149 weight vectors
- Estimated match proportion 0.382

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 9 matches and 48 non-matches
    Purity of oracle classification:  0.842
    Entropy of oracle classification: 0.629
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing the file: diverg(20)318_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 318), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)318_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 179 matches and 760 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (760, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 179 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 179 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 43 matches and 15 non-matches
    Purity of oracle classification:  0.741
    Entropy of oracle classification: 0.825
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  15
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing the file: diverg(10)710_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 710), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)710_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 712
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 712 weight vectors
  Containing 201 true matches and 511 true non-matches
    (28.23% true matches)
  Identified 667 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   633  (94.90%)
          2 :    31  (4.65%)
          3 :     2  (0.30%)
         11 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 667 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 490

Removed 1 non-pure weight vector

Final number of weight vectors to use: 711
  Number of unique weight vectors: 667

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (667, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 667 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 667 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
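
The "farthest first" selection logged above can be sketched as a greedy max-min traversal: start from one vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. This is a minimal sketch only; the seeding strategy (first vector) and the Euclidean distance metric are assumptions, not confirmed details of the script.

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal (sketch): seed with the first
    vector, then repeatedly pick the candidate whose minimum distance
    to the selected set is largest."""
    selected = [vectors[0]]
    # Distance from each candidate to its nearest selected vector so far.
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        # Pick the candidate farthest from everything selected so far.
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        # Update nearest-selected distances against the new pick.
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected
```

Because each pick maximizes the minimum distance to the chosen set, the sample spreads across the weight-vector space, which is why the selected vectors above mix clear matches and clear non-matches rather than clustering in one region.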

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 26 matches and 58 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.893
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
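
The purity and entropy figures reported for each oracle-classified sample are consistent with cluster purity being the majority-class fraction and entropy being the binary Shannon entropy of the match/non-match split. A minimal sketch reproducing the logged numbers (e.g. 26 matches / 58 non-matches gives purity 0.690, entropy 0.893):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = binary
    Shannon entropy of the match proportion (an assumed reading of
    the log, matching its reported values)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

A perfectly pure cluster (all matches or all non-matches) gives purity 1.0 and entropy 0.0, which is why the splitting loop keeps recursing until clusters approach that state or the budget runs out.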

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 583 weight vectors
  Based on 26 matches and 58 non-matches
  Classified 123 matches and 460 non-matches
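
The splitting step above trains an SVM on the oracle-labelled sample and partitions the remaining cluster by predicted class, producing the two child clusters queued in the next loop. A sketch assuming scikit-learn's `SVC` with a linear kernel; the original script's SVM library and kernel choice are not confirmed here.

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-labelled vectors (labels: 1 = match,
    0 = non-match) and split the rest of the cluster by prediction.
    Sketch only: kernel and library are assumptions."""
    clf = svm.SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```

In the run above this split turns the 583 unclassified vectors into child clusters of 123 predicted matches and 460 predicted non-matches, each inheriting the parent sample's purity/entropy estimates until it is sampled itself.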

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)
    (460, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)

Current size of match and non-match training data sets: 26 / 58

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 123 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing file: diverg(20)499_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 499), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)499_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 153 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)800_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 800), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)800_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.05 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
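The purity, entropy, and estimated match proportion figures reported throughout this log follow directly from the oracle's match/non-match counts. A minimal sketch, assuming purity is the majority-class fraction and entropy is binary Shannon entropy (the helper name is illustrative, not from the script):

```python
import math

def cluster_quality(num_match, num_non_match):
    """Return (purity, entropy, match_proportion) for a classified sample.

    Purity is the fraction of the majority class; entropy is the binary
    Shannon entropy of the match/non-match split.
    """
    total = num_match + num_non_match
    p = num_match / total          # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy, p

# The 23 matches / 65 non-matches classified above:
purity, entropy, prop = cluster_quality(23, 65)
```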

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches
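At this point the remaining 950 vectors are split into two child clusters by a classifier trained on the 88 oracle-labelled samples. The script uses an SVM here; as a dependency-free stand-in, the split step can be sketched with a nearest-centroid rule (all names are illustrative, and a real run would substitute e.g. scikit-learn's `svm.SVC`):

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def dist(a, b):
    """Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def split_cluster(unlabelled, match_sample, non_match_sample):
    """Split the remaining cluster into predicted matches / non-matches.

    Nearest-centroid stand-in for the SVM step in the log above.
    """
    cm = centroid(match_sample)
    cn = centroid(non_match_sample)
    matches, non_matches = [], []
    for v in unlabelled:
        (matches if dist(v, cm) <= dist(v, cn) else non_matches).append(v)
    return matches, non_matches
```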

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
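The "farthest first" selections logged above can be sketched as a greedy max-min traversal: each new pick maximises its distance to the closest already-selected vector. Euclidean distance and starting from index 0 are assumptions; the script's own implementation may differ:

```python
def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal over a list of numeric vectors.

    Returns the indices of `k` selected vectors, beginning at `start`.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [start]
    # Minimum distance of every vector to the selected set so far.
    min_d = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        nxt = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            d = dist(v, vectors[nxt])
            if d < min_d[i]:
                min_d[i] = d
    return selected
```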

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
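The oracle is parameterised by an accuracy (`oracle_acc` in the usage notes); at 100.00% it classifies everything correctly, as above. A minimal simulation, assuming each label is reported correctly with probability `accuracy` and flipped otherwise (the exact error model of the script is an assumption):

```python
import random

def oracle(true_labels, accuracy, rng=None):
    """Simulate an imperfect oracle over boolean match labels.

    Each label is flipped independently with probability 1 - accuracy.
    """
    rng = rng or random.Random(0)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```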

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(10)406_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 406), dtype: object
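The precision, recall, and f-measure fields in the Series above follow directly from the tp/fp/fn counts. A small sketch reproducing them (function name is illustrative):

```python
def prf(tp, fp, fn):
    """Precision, recall and f-measure from pair-level counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Equivalent to the harmonic mean 2*p*r / (p + r).
    f_measure = 2 * tp / (2 * tp + fp + fn)
    return precision, recall, f_measure
```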

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)406_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 469
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 469 weight vectors
  Containing 167 true matches and 302 true non-matches
    (35.61% true matches)
  Identified 452 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   441  (97.57%)
          2 :     8  (1.77%)
          3 :     2  (0.44%)
          6 :     1  (0.22%)
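The frequency distribution above (441 vectors occurring once, 8 twice, and so on) can be reproduced by counting duplicate weight vectors, e.g. with `collections.Counter` over tuple keys (the function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight vectors
    that occur exactly that often."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```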

Identified 0 non-pure unique weight vectors (from 452 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 150
     0.000 : 302

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 469
  Number of unique weight vectors: 452

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (452, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 452 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 452 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.680, 0.000, 0.609, 0.737, 0.600, 0.529, 0.696] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.462, 0.609, 0.643, 0.706, 0.786] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.826, 0.429, 0.538, 0.636] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 26 matches and 53 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 373 weight vectors
  Based on 26 matches and 53 non-matches
  Classified 131 matches and 242 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.6708860759493671, 0.9140185106642176, 0.3291139240506329)
    (242, 0.6708860759493671, 0.9140185106642176, 0.3291139240506329)

Current size of match and non-match training data sets: 26 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 242 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 63

Farthest first selection of 63 weight vectors from 242 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.818, 0.762, 0.714, 0.500, 0.400] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.815, 0.643, 0.800, 0.750, 0.429] (False)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.571, 0.867, 0.471, 0.583, 0.643] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.611, 0.000, 0.800, 0.684, 0.500, 0.778, 0.609] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.682, 0.667, 0.286, 0.700, 0.533] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 63 weight vectors
  The oracle will correctly classify 63 weight vectors and wrongly classify 0
  Classified 0 matches and 63 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 63 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(10)125_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 125), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)125_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 602
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 602 weight vectors
  Containing 172 true matches and 430 true non-matches
    (28.57% true matches)
  Identified 582 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   571  (98.11%)
          2 :     8  (1.37%)
          3 :     2  (0.34%)
          9 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 582 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 154
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 427

Removed 9 non-pure weight vectors
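A non-pure unique weight vector is one that occurs with both match and non-match labels; in the run above, all nine copies of the 0.889-pureness vector are removed. A sketch of that filtering step, with illustrative names (this models the "all copies removed" variant, not the minority-class-only variant seen later in the log):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels):
    """Drop every copy of any unique weight vector whose pureness
    (fraction of match labels) is strictly between 0 and 1."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [matches, total]
    for v, lab in zip(weight_vectors, labels):
        key = tuple(v)
        counts[key][0] += 1 if lab else 0
        counts[key][1] += 1
    kept_v, kept_l = [], []
    for v, lab in zip(weight_vectors, labels):
        m, t = counts[tuple(v)]
        if m == 0 or m == t:          # pure: all matches or all non-matches
            kept_v.append(v)
            kept_l.append(lab)
    return kept_v, kept_l
```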

Final number of weight vectors to use: 593
  Number of unique weight vectors: 581

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (581, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 581 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 581 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 29 matches and 53 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.937
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 499 weight vectors
  Based on 29 matches and 53 non-matches
  Classified 126 matches and 373 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (126, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)
    (373, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)

Current size of match and non-match training data sets: 29 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 126 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 126 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 45 matches and 7 non-matches
    Purity of oracle classification:  0.865
    Entropy of oracle classification: 0.570
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0
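
The purity and entropy figures reported for each oracle classification follow the standard definitions for a two-class split: purity is the fraction of the majority class, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch reproducing the numbers above (the function name is illustrative, not from the original program):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity and binary Shannon entropy of a two-class split."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)         # fraction of the majority class
    # Entropy is 0 for a pure split, 1 for a 50/50 split
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# The 45 match / 7 non-match oracle result above:
purity, entropy = purity_entropy(45, 7)
print(round(purity, 3), round(entropy, 3))  # 0.865 0.57
```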

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analyzing file: diverg(10)879_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984615
recall                 0.214047
f-measure              0.351648
da                           65
dm                            0
ndm                           0
tp                           64
fp                            1
tn                  4.76529e+07
fn                          235
Name: (10, 1 - acm diverg, 879), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)879_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 522
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 522 weight vectors
  Containing 188 true matches and 334 true non-matches
    (36.02% true matches)
  Identified 496 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   484  (97.58%)
          2 :     9  (1.81%)
          3 :     2  (0.40%)
         14 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 496 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 162
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 333

Removed 1 non-pure weight vector

Final number of weight vectors to use: 521
  Number of unique weight vectors: 496

Time to load and analyse the weight vector file: 0.01 sec
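
The pureness filtering reported above can be sketched as follows: identical weight vectors are grouped, the fraction of occurrences that are true matches gives each unique vector's pureness, and for vectors that occur with both labels only the majority-class copies are kept. A sketch under those assumptions (names are illustrative):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """weight_vectors: list of (vector_tuple, is_match) pairs.
    Drop minority-class copies of vectors that occur with both labels."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [non-match count, match count]
    for vec, is_match in weight_vectors:
        counts[vec][int(is_match)] += 1
    kept = []
    for vec, is_match in weight_vectors:
        non_m, m = counts[vec]
        if m > 0 and non_m > 0:             # non-pure unique vector
            majority_is_match = m >= non_m  # keep only the majority class
            if is_match != majority_is_match:
                continue
        kept.append((vec, is_match))
    return kept

# Toy example: one vector occurs 3 times as a match and once as a non-match
data = [((0.9, 1.0), True)] * 3 + [((0.9, 1.0), False), ((0.1, 0.2), False)]
print(len(remove_non_pure(data)))  # 4
```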

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (496, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 496 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 496 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
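
The farthest-first selection listed above greedily builds a spread-out sample: starting from one vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest. A sketch assuming Euclidean distance and an arbitrary starting vector (the program's actual metric and starting rule are not shown in this output):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors (Euclidean distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]          # arbitrary starting vector
    remaining = list(vectors[1:])
    # Minimum distance of each remaining vector to the selected set
    min_dist = [dist(v, selected[0]) for v in remaining]
    while len(selected) < k and remaining:
        i = max(range(len(remaining)), key=min_dist.__getitem__)
        chosen = remaining.pop(i)
        min_dist.pop(i)
        selected.append(chosen)
        min_dist = [min(d, dist(v, chosen)) for d, v in zip(min_dist, remaining)]
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.1, 0.1), (0.9, 0.9)], 2)
print(sample)  # [(0.0, 0.0), (1.0, 1.0)]
```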

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 31 matches and 49 non-matches
    Purity of oracle classification:  0.613
    Entropy of oracle classification: 0.963
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 416 weight vectors
  Based on 31 matches and 49 non-matches
  Classified 133 matches and 283 non-matches
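
The oracle-labelled sample then seeds a classifier that splits the rest of the cluster into predicted matches and non-matches. A sketch assuming scikit-learn's `SVC` with toy data (the original program's classifier settings are not shown in this output):

```python
from sklearn.svm import SVC

# Oracle-labelled training sample: weight vectors plus match labels
train_X = [[0.9, 0.95, 0.9], [0.95, 1.0, 0.85],   # matches
           [0.1, 0.2, 0.15], [0.2, 0.1, 0.05]]    # non-matches
train_y = [1, 1, 0, 0]

clf = SVC(kernel="linear")
clf.fit(train_X, train_y)

# Unlabelled vectors remaining in the cluster are split by the prediction
remaining = [[0.92, 0.9, 0.88], [0.15, 0.1, 0.2]]
pred = clf.predict(remaining)
print(pred.tolist())  # [1, 0]
```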

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.6125, 0.9631672450918832, 0.3875)
    (283, 0.6125, 0.9631672450918832, 0.3875)

Current size of match and non-match training data sets: 31 / 49

Selected cluster with (queue ordering: random):
- Purity 0.61 and entropy 0.96
- Size 283 weight vectors
- Estimated match proportion 0.388

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 283 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [1.000, 0.000, 0.864, 0.667, 0.435, 0.700, 0.600] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.846, 0.857, 0.353, 0.318, 0.400] (False)
    [0.680, 0.000, 0.609, 0.737, 0.600, 0.529, 0.696] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 4 matches and 65 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.319
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

65.0
Analyzing file: diverg(20)62_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 62), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)62_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 24 matches and 64 non-matches
    Purity of oracle classification:  0.727
    Entropy of oracle classification: 0.845
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 24 matches and 64 non-matches
  Classified 91 matches and 857 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (91, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)
    (857, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)

Current size of match and non-match training data sets: 24 / 64

Selected cluster with (queue ordering: random):
- Purity 0.73 and entropy 0.85
- Size 857 weight vectors
- Estimated match proportion 0.273

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 857 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 18 matches and 52 non-matches
    Purity of oracle classification:  0.743
    Entropy of oracle classification: 0.822
    Number of true matches:      18
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)82_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 82), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)82_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
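
Farthest-first traversal repeatedly selects the vector whose minimum distance to all previously selected vectors is largest, which is why the sample above mixes very dissimilar weight vectors. A minimal sketch of the greedy k-center traversal; seeding with the first vector and using Euclidean distance are assumptions, as the program's actual choices are not shown in this log:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first (k-center) selection of k vectors.

    Seed and distance metric are assumptions; the program may differ.
    """
    selected = [vectors[0]]
    # Distance from each candidate to its nearest already-selected vector
    min_dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected
```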

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
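
The purity and entropy figures follow directly from the oracle's class counts: purity is the majority-class fraction and entropy the binary Shannon entropy of the match proportion. For 24 matches and 63 non-matches this reproduces the printed 0.724 and 0.850 (and the unrounded 0.8497511... that reappears in the Loop 2 queue statistics):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy of the match proportion."""
    p = num_matches / (num_matches + num_non_matches)
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```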

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches
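
The split step trains a classifier on the oracle-labelled sample and partitions the remaining 829 vectors by its predictions; each predicted class becomes a new cluster in the queue. The program uses an SVM (the `split_classifier` parameter); the sketch below substitutes a dependency-free nearest-centroid classifier to illustrate the splitting step itself:

```python
import math

def split_cluster(train_matches, train_non_matches, remaining):
    """Split remaining weight vectors with a classifier trained on the
    oracle-labelled sample.  Nearest-centroid stands in here for the
    SVM the program actually uses."""
    def centroid(vectors):
        n = len(vectors)
        return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

    cm = centroid(train_matches)        # centroid of labelled matches
    cn = centroid(train_non_matches)    # centroid of labelled non-matches
    pred_matches, pred_non_matches = [], []
    for v in remaining:
        (pred_matches if math.dist(v, cm) < math.dist(v, cn)
         else pred_non_matches).append(v)
    return pred_matches, pred_non_matches
```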

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 123 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
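
Each run above follows the same recursive structure: a queue of clusters, each loop popping one cluster at random, sampling it, labelling the sample via the oracle, and either keeping the cluster as pure or splitting it with a classifier, until the manual-classification budget is exhausted. In outline (all helper names are illustrative stand-ins, not the program's own):

```python
import random

def recursive_selection(all_vectors, budget, sample, oracle, split, is_pure):
    """Outline of the recursive training-example selection loop; the
    sample, oracle, split and is_pure helpers are stand-ins."""
    queue = [list(all_vectors)]          # start with one cluster holding everything
    train_m, train_n = [], []            # match / non-match training sets
    used = 0                             # manual classifications performed so far
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # queue ordering: random
        labelled = sample(cluster)       # e.g. farthest-first selection
        used += len(labelled)
        m, n = oracle(labelled)          # manual (oracle) classification
        train_m.extend(m)
        train_n.extend(n)
        rest = [v for v in cluster if v not in labelled]
        if rest and not is_pure(m, n):   # not pure enough / too large: split
            queue.extend(split(m, n, rest))
    return train_m, train_n
```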

40.0
Analysing the file: diverg(10)374_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 374), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)374_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 863
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 863 weight vectors
  Containing 195 true matches and 668 true non-matches
    (22.60% true matches)
  Identified 811 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   775  (95.56%)
          2 :    33  (4.07%)
          3 :     2  (0.25%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 811 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 163
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 647

Removed 1 non-pure weight vector

Final number of weight vectors to use: 862
  Number of unique weight vectors: 811

Time to load and analyse the weight vector file: 0.01 sec
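
The load-and-analyse step groups identical weight vectors, reports the occurrence distribution, and removes minority-class copies of any unique vector that appears with both labels (a pureness of 0.938 ≈ 15/16 is consistent with the one vector above that occurs 16 times, though the log does not state this explicitly). A sketch of that grouping and filtering, assuming input as (weights, is_match) pairs:

```python
from collections import Counter

def analyse(weight_vectors):
    """Group identical weight vectors, then drop minority-class copies
    of any non-pure unique vector (illustrative reconstruction of the
    program's analysis step)."""
    occurrences = Counter()
    match_copies = Counter()
    for vec, is_match in weight_vectors:
        occurrences[vec] += 1
        match_copies[vec] += int(is_match)

    kept = []
    for vec, is_match in weight_vectors:
        pureness = match_copies[vec] / occurrences[vec]
        if is_match == (pureness >= 0.5):   # keep only majority-class copies
            kept.append((vec, is_match))
    return occurrences, kept
```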

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (811, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 811 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 811 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 30 matches and 56 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.933
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 725 weight vectors
  Based on 30 matches and 56 non-matches
  Classified 153 matches and 572 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)
    (572, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)

Current size of match and non-match training data sets: 30 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 572 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 572 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.400, 0.583, 0.563] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.462, 0.889, 0.455, 0.211, 0.375] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 0 matches and 76 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(20)120_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 120), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)120_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 159 matches and 780 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (780, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 159 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 159 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
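
Farthest-first selection (Gonzalez' k-center heuristic) starts from one vector and repeatedly adds the vector whose distance to its nearest already-selected vector is largest, producing a sample spread across the cluster. A compact sketch, assuming Euclidean distance:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: return k vectors that are
    maximally spread out over the input set."""
    selected = [vectors[0]]                 # arbitrary start vector
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (1.0, 0.0)], 2)
print(sample)  # [(0.0, 0.0), (1.0, 1.0)]
```

With 7-dimensional weight vectors as above, the same routine picks the 56 mutually farthest vectors out of 159.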

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)785_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (15, 1 - acm diverg, 785), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)785_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 697
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 697 weight vectors
  Containing 169 true matches and 528 true non-matches
    (24.25% true matches)
  Identified 678 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   665  (98.08%)
          2 :    10  (1.47%)
          3 :     2  (0.29%)
          6 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 678 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 152
     0.000 : 526
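
The pureness distribution above groups identical weight vectors and reports the fraction of true matches inside each group; a group with pureness strictly between 0 and 1 is non-pure and gets removed. A sketch assuming (vector, true_match) pairs as input, with illustrative names:

```python
from collections import defaultdict

def pureness_distribution(weight_vectors):
    """Group identical weight vectors and return {pureness: count},
    where pureness is the match fraction within each unique vector."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[tuple(vec)].append(is_match)
    dist = defaultdict(int)
    for labels in groups.values():
        pureness = sum(labels) / len(labels)
        dist[round(pureness, 3)] += 1
    return dict(dist)

data = [([0.9, 1.0], True), ([0.9, 1.0], True),   # pure match vector
        ([0.1, 0.0], False),                      # pure non-match vector
        ([0.5, 0.5], True), ([0.5, 0.5], False)]  # non-pure vector
print(pureness_distribution(data))  # {1.0: 1, 0.0: 1, 0.5: 1}
```

In this run all 678 unique vectors have pureness 1.000 or 0.000, so nothing is removed.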

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 697
  Number of unique weight vectors: 678

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (678, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 678 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 678 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
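
At 100.00% accuracy the oracle simply returns the true match status, so no false matches or false non-matches appear above; at lower accuracies a corresponding fraction of labels would be flipped. A sketch of such a simulated noisy oracle (illustrative, not the script's implementation):

```python
import random

def oracle_classify(true_labels, accuracy, seed=0):
    """Return labels where each true label is flipped with
    probability (1 - accuracy), simulating an imperfect oracle."""
    rng = random.Random(seed)
    return [label if rng.random() < accuracy else not label
            for label in true_labels]

labels = [True] * 29 + [False] * 55      # the 84 vectors classified above
perfect = oracle_classify(labels, 1.0)
print(perfect == labels)  # True: a 100% accurate oracle never errs
```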

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 594 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 111 matches and 483 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (111, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (483, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 483 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 483 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analyzing file: diverg(15)16_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (15, 1 - acm diverg, 16), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)16_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 869
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 869 weight vectors
  Containing 190 true matches and 679 true non-matches
    (21.86% true matches)
  Identified 829 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   795  (95.90%)
          2 :    31  (3.74%)
          3 :     2  (0.24%)
          6 :     1  (0.12%)

Identified 0 non-pure unique weight vectors (from 829 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 170
     0.000 : 659

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 869
  Number of unique weight vectors: 829

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (829, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 829 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 829 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 27 matches and 59 non-matches
    Purity of oracle classification:  0.686
    Entropy of oracle classification: 0.898
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 743 weight vectors
  Based on 27 matches and 59 non-matches
  Classified 126 matches and 617 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (126, 0.686046511627907, 0.8976844934141643, 0.313953488372093)
    (617, 0.686046511627907, 0.8976844934141643, 0.313953488372093)

Current size of match and non-match training data sets: 27 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.90
- Size 126 weight vectors
- Estimated match proportion 0.314

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 126 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 49 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.141
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analyzing file: diverg(10)972_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 972), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)972_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 870
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 870 weight vectors
  Containing 175 true matches and 695 true non-matches
    (20.11% true matches)
  Identified 831 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   801  (96.39%)
          2 :    27  (3.25%)
          3 :     2  (0.24%)
          9 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 831 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 156
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 674

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 861
  Number of unique weight vectors: 830

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (830, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 830 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 830 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
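The "far" initial selection used above is a farthest-first traversal; a minimal sketch, assuming Euclidean distance and a fixed starting vector (both are assumptions — the original's starting point and metric are not shown in this log):

```python
import math

def farthest_first(vectors, k, start=0):
    # Greedy farthest-first traversal: begin with one vector, then
    # repeatedly add the vector whose minimum distance to the
    # already-selected set is largest.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [start]
    min_dist = [dist(vectors[start], v) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist = [min(d, dist(vectors[nxt], v))
                    for d, v in zip(min_dist, vectors)]
    return selected

# farthest_first([(0, 0), (1, 0), (10, 0), (5, 0)], 3) -> [0, 2, 3]
```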

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and misclassify 0
  Classified 31 matches and 55 non-matches
    Purity of oracle classification:  0.640
    Entropy of oracle classification: 0.943
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 744 weight vectors
  Based on 31 matches and 55 non-matches
  Classified 175 matches and 569 non-matches
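The split step trains a classifier on the oracle-labelled sample and divides the remaining cluster by its predictions. A sketch of that step with scikit-learn (the linear kernel and the label encoding 1 = match are our assumptions; the original's SVM settings are not shown in this log):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    # Fit a binary SVM on the oracle-labelled weight vectors, then
    # split the rest of the cluster into predicted matches and
    # predicted non-matches (the two child clusters put on the queue).
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, pred) if p != 1]
    return matches, non_matches
```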

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (175, 0.6395348837209303, 0.9430685934712908, 0.36046511627906974)
    (569, 0.6395348837209303, 0.9430685934712908, 0.36046511627906974)

Current size of match and non-match training data sets: 31 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 175 weight vectors
- Estimated match proportion 0.360

Sample size for this cluster: 59

Farthest first selection of 59 weight vectors from 175 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.619, 1.000, 0.103, 0.163, 0.129, 0.146, 0.213] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 59 weight vectors
  The oracle will correctly classify 59 weight vectors and misclassify 0
  Classified 39 matches and 20 non-matches
    Purity of oracle classification:  0.661
    Entropy of oracle classification: 0.924
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  20
    Number of false non-matches: 0

Deleted 59 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing the file: diverg(20)734_NEW.csv
<class 'pandas.core.series.Series'>
Current line here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 734), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)734_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and misclassify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 793 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and misclassify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(15)737_NEW.csv
<class 'pandas.core.series.Series'>
Current line here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 737), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)737_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 644
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 644 weight vectors
  Containing 212 true matches and 432 true non-matches
    (32.92% true matches)
  Identified 608 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   591  (97.20%)
          2 :    14  (2.30%)
          3 :     2  (0.33%)
         19 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 608 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 429

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 643
  Number of unique weight vectors: 608

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (608, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 608 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 608 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
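
The purity and entropy figures reported for each oracle classification can be reproduced with a short sketch. `purity_entropy` is a hypothetical helper (not part of the original script); it assumes purity is the majority-class fraction and entropy is the binary Shannon entropy of the match/non-match split, which matches the numbers printed above.

```python
from math import log2

def purity_entropy(num_matches, num_non_matches):
    """Purity is the fraction of the majority class; entropy is the
    binary Shannon entropy of the match / non-match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)
    return purity, entropy

# For 28 matches / 55 non-matches this gives purity ~0.663 and
# entropy ~0.922, as reported in the log above.
```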

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 525 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 152 matches and 373 non-matches
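
The SVM splitting step above can be sketched as follows, assuming scikit-learn's `SVC` as the classifier (`svm_split` is a hypothetical helper; the original script's SVM library and kernel are not shown here). The idea is to train on the oracle-labelled vectors and split the rest of the cluster by predicted class.

```python
import numpy as np
from sklearn.svm import SVC  # assumption: scikit-learn stands in for the original SVM code

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train a binary SVM on the oracle-classified weight vectors, then
    split the remaining cluster into a predicted-match sub-cluster and a
    predicted-non-match sub-cluster."""
    clf = SVC(kernel="linear")
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    match_cluster = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_match_cluster = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return match_cluster, non_match_cluster
```

Both sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.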

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (373, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 152 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 152 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
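
The "farthest first selection" step above can be sketched as a greedy farthest-first traversal. `farthest_first` is a hypothetical illustration; the random start point and Euclidean distance are assumptions, since the original selection code is not shown in this log.

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal: start from a random vector, then
    repeatedly add the vector whose minimum Euclidean distance to the
    already-selected set is largest (a diversity-seeking sample)."""
    X = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(X)))]          # random start point
    min_dist = np.linalg.norm(X - X[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(min_dist.argmax())                # farthest from current set
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected
```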

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 49 matches and 6 non-matches
    Purity of oracle classification:  0.891
    Entropy of oracle classification: 0.497
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)749_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 749), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)749_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 732
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 732 weight vectors
  Containing 219 true matches and 513 true non-matches
    (29.92% true matches)
  Identified 677 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   641  (94.68%)
          2 :    33  (4.87%)
          3 :     2  (0.30%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 677 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 492

Removed 1 non-pure weight vector

Final number of weight vectors to use: 731
  Number of unique weight vectors: 677

Time to load and analyse the weight vector file: 0.01 sec
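
The occurrence-frequency and pureness analysis above can be sketched with `collections.Counter`. `analyse_weight_vectors` is a hypothetical helper; it assumes pureness is the fraction of a distinct vector's occurrences that are true matches, so a vector with pureness strictly between 0 and 1 is "non-pure".

```python
from collections import Counter

def analyse_weight_vectors(vectors, labels):
    """Count how often each distinct weight vector occurs, and for each
    distinct vector compute its pureness: the fraction of its occurrences
    that are true matches."""
    tuples = list(map(tuple, vectors))
    occ = Counter(tuples)                              # vector -> occurrence count
    matches = Counter(t for t, is_match in zip(tuples, labels) if is_match)
    freq_dist = Counter(occ.values())                  # occurrence -> number of vectors
    pureness = {t: matches[t] / n for t, n in occ.items()}
    non_pure = [t for t, p in pureness.items() if 0.0 < p < 1.0]
    return freq_dist, pureness, non_pure
```

The minority-class occurrences of each non-pure vector are then removed, which is how 732 weight vectors become 731 in the run above.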

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (677, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 677 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 677 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 593 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 148 matches and 445 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (445, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 445 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 445 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 8 matches and 62 non-matches
    Purity of oracle classification:  0.886
    Entropy of oracle classification: 0.513
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(15)635_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 635), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)635_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 0 matches and 829 non-matches

40.0
Analysing file: diverg(20)199_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (20, 1 - acm diverg, 199), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)199_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1053
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1053 weight vectors
  Containing 187 true matches and 866 true non-matches
    (17.76% true matches)
  Identified 1011 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   980  (96.93%)
          2 :    28  (2.77%)
          3 :     2  (0.20%)
         11 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1011 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 165
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 845

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1052
  Number of unique weight vectors: 1011

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1011, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1011 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1011 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
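The "far" method above is a farthest-first traversal: at each step it greedily adds the vector whose minimum distance to the already selected vectors is largest, so the sample spreads across the weight space. A minimal sketch, assuming Euclidean distance and a fixed starting vector (the actual program may choose the start point and metric differently):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum Euclidean distance to the selected set is largest."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    selected = [vectors[0]]   # assumption: start from the first vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected

sample = farthest_first([(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (0.0, 1.0)], 3)
# picks the mutually distant corners: (0.0, 0.0), (1.0, 1.0), (0.0, 1.0)
```

This is why the listing above mixes very high- and very low-similarity vectors: the traversal deliberately covers the extremes of the cluster rather than sampling near its centre.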

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
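The purity and entropy figures reported for an oracle classification follow the standard two-class definitions: purity is the majority-class fraction, and entropy is the base-2 Shannon entropy of the match proportion. A quick check against the numbers above (24 matches, 63 non-matches):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class.  Entropy: base-2 Shannon
    entropy of the match / non-match proportions."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = purity_entropy(24, 63)
# purity ≈ 0.724, entropy ≈ 0.850, matching the oracle output above
```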

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 924 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 91 matches and 833 non-matches
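At this point the program trains a classifier (an SVM, per the split_classifier option) on the 87 oracle-labelled vectors and uses its predictions to split the remaining 924 into two child clusters. As a dependency-free stand-in for the SVM, the same train-then-split step can be sketched with a simple perceptron (a deliberate simplification, not the program's actual classifier):

```python
def linear_split(train_match, train_non_match, unlabelled,
                 epochs=100, lr=0.1):
    """Train a perceptron on the labelled weight vectors, then partition
    the unlabelled vectors by the learned decision boundary."""
    dim = len(train_match[0])
    w, b = [0.0] * dim, 0.0
    data = [(v, 1) for v in train_match] + [(v, -1) for v in train_non_match]
    for _ in range(epochs):
        for v, y in data:
            # Standard perceptron update on misclassified examples
            if y * (sum(wi * xi for wi, xi in zip(w, v)) + b) <= 0:
                w = [wi + lr * y * xi for wi, xi in zip(w, v)]
                b += lr * y
    score = lambda v: sum(wi * xi for wi, xi in zip(w, v)) + b
    match_cluster = [v for v in unlabelled if score(v) > 0]
    non_match_cluster = [v for v in unlabelled if score(v) <= 0]
    return match_cluster, non_match_cluster

m, n = linear_split([(0.9, 0.9), (1.0, 0.8)], [(0.1, 0.1), (0.0, 0.2)],
                    [(0.95, 0.9), (0.05, 0.1)])
# m == [(0.95, 0.9)], n == [(0.05, 0.1)]
```

Each child cluster is then pushed back onto the queue with the parent's purity, entropy, and estimated match proportion, as the Loop 2 header below shows.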

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (91, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (833, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 91 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 91 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 42 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(20)901_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 901), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)901_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1100
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1100 weight vectors
  Containing 227 true matches and 873 true non-matches
    (20.64% true matches)
  Identified 1043 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1006  (96.45%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1043 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1099
  Number of unique weight vectors: 1043

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1043, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1043 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1043 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 955 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 846 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-matches
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)476_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981481
recall                 0.177258
f-measure              0.300283
da                           54
dm                            0
ndm                           0
tp                           53
fp                            1
tn                  4.76529e+07
fn                          246
Name: (10, 1 - acm diverg, 476), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)476_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1030
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1030 weight vectors
  Containing 210 true matches and 820 true non-matches
    (20.39% true matches)
  Identified 976 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   941  (96.41%)
          2 :    32  (3.28%)
          3 :     2  (0.20%)
         19 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 976 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 799

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1029
  Number of unique weight vectors: 976

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (976, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 976 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 976 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 32 matches and 55 non-matches
    Purity of oracle classification:  0.632
    Entropy of oracle classification: 0.949
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 889 weight vectors
  Based on 32 matches and 55 non-matches
  Classified 317 matches and 572 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (317, 0.632183908045977, 0.9489804585630242, 0.367816091954023)
    (572, 0.632183908045977, 0.9489804585630242, 0.367816091954023)

Current size of match and non-match training data sets: 32 / 55

Selected cluster (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 572 weight vectors
- Estimated match proportion 0.368

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 572 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
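Farthest-first selection, as used above, greedily picks each next vector so that its minimum distance to the already-selected vectors is maximal. A minimal sketch assuming Euclidean distance and a fixed starting vector (the program may use a different metric or starting rule):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal over equal-length numeric
    vectors; returns the indices of the k selected vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    # min_dist[i] = distance from vector i to its nearest selected vector
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[nxt]))
    return selected

# toy 1-D example: from 0.0, the farthest point is 10.0, then 1.0
print(farthest_first([[0.0], [1.0], [9.0], [10.0]], 3))  # [0, 3, 1]
```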

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

54.0
Analysing file: diverg(15)318_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 318), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)318_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 907
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 907 weight vectors
  Containing 204 true matches and 703 true non-matches
    (22.49% true matches)
  Identified 858 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   824  (96.04%)
          2 :    31  (3.61%)
          3 :     2  (0.23%)
         15 :     1  (0.12%)
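The frequency distribution above first counts how often each distinct weight vector occurs, then counts how many distinct vectors share each occurrence count. A sketch using `collections.Counter` (the helper name is mine, not from the program):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that occur exactly that often."""
    vec_counts = Counter(map(tuple, weight_vectors))  # tuples are hashable
    return Counter(vec_counts.values())

# toy example: one vector occurs twice, two vectors occur once
print(occurrence_distribution([[1, 0], [1, 0], [0, 1], [0.5, 0.5]]))
```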

Identified 1 non-pure unique weight vector (from 858 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 682

Removed 1 non-pure weight vector

Final number of weight vectors to use: 906
  Number of unique weight vectors: 858

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (858, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 858 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 858 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 772 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 149 matches and 623 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (623, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 149 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(20)813_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 813), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)813_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.429, 0.786, 0.750, 0.389, 0.857] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 147 matches and 537 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (537, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 147 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 147 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 53 matches and 2 non-matches
    Purity of oracle classification:  0.964
    Entropy of oracle classification: 0.225
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)119_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 119), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)119_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
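
The "far" method above is a greedy farthest-first traversal: starting from a seed vector, it repeatedly selects the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and an arbitrary (here: first) seed; the original script's distance measure and seeding strategy are not shown in this output:

```python
import math

def farthest_first(vectors, k, seed_index=0):
    """Greedy farthest-first selection of k vectors."""
    def euclid(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[seed_index]]
    # current minimum distance from every vector to the selected set
    dists = [euclid(v, selected[0]) for v in vectors]
    for _ in range(k - 1):
        i = max(range(len(vectors)), key=dists.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            dists[j] = min(dists[j], euclid(v, vectors[i]))
    return selected
```

Because each pick maximizes the minimum distance to the chosen set, the selected sample spreads across the weight-vector space, which is why the lists above mix clear matches and clear non-matches.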

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 817 non-matches
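
The split step trains an SVM on the oracle-labelled sample and uses it to divide the remaining unlabelled weight vectors into two child clusters. A minimal sketch assuming scikit-learn; the original script's SVM library, kernel, and parameters are not shown in this output:

```python
from sklearn import svm

def svm_split(match_vectors, non_match_vectors, remaining):
    """Train an SVM on oracle-labelled vectors, then split the remaining
    (unlabelled) weight vectors into match / non-match child clusters."""
    X = match_vectors + non_match_vectors
    y = [1] * len(match_vectors) + [0] * len(non_match_vectors)
    clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(X, y)
    pred = clf.predict(remaining)
    matches = [v for v, p in zip(remaining, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining, pred) if p == 0]
    return matches, non_matches
```

The two resulting clusters inherit the parent's purity, entropy, and match-proportion estimates, which is why both queue entries in the next loop show identical statistics.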

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 817 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 817 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 11 matches and 60 non-matches
    Purity of oracle classification:  0.845
    Entropy of oracle classification: 0.622
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)229_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 229), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)229_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 453
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 453 weight vectors
  Containing 218 true matches and 235 true non-matches
    (48.12% true matches)
  Identified 417 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   398  (95.44%)
          2 :    16  (3.84%)
          3 :     2  (0.48%)
         17 :     1  (0.24%)

Identified 1 non-pure unique weight vector (from 417 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 232

Removed 1 non-pure weight vector

Final number of weight vectors to use: 452
  Number of unique weight vectors: 417

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (417, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 417 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 417 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 37 matches and 41 non-matches
    Purity of oracle classification:  0.526
    Entropy of oracle classification: 0.998
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  41
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 339 weight vectors
  Based on 37 matches and 41 non-matches
  Classified 278 matches and 61 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (278, 0.5256410256410257, 0.9981021327390103, 0.47435897435897434)
    (61, 0.5256410256410257, 0.9981021327390103, 0.47435897435897434)

Current size of match and non-match training data sets: 37 / 41

Selected cluster (queue ordering: random) with:
- Purity 0.53 and entropy 1.00
- Size 61 weight vectors
- Estimated match proportion 0.474

Sample size for this cluster: 38

Farthest first selection of 38 weight vectors from 61 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 0.000, 0.818, 0.727, 0.438, 0.375, 0.400] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)

Perform oracle with 100.00% accuracy on 38 weight vectors
  The oracle will correctly classify 38 weight vectors and wrongly classify 0
  Classified 0 matches and 38 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 38 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(20)52_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 52), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)52_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 855
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 855 weight vectors
  Containing 221 true matches and 634 true non-matches
    (25.85% true matches)
  Identified 799 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   763  (95.49%)
          2 :    33  (4.13%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 799 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 613

Removed 1 non-pure weight vector

Final number of weight vectors to use: 854
  Number of unique weight vectors: 799

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (799, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 799 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 799 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
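
The purity, entropy, and estimated match proportion reported after each oracle classification can be reproduced with a minimal sketch, assuming purity is the majority-class fraction and entropy the binary Shannon entropy of the match proportion (the function names `purity` and `entropy` are illustrative, not taken from the program):

```python
import math

def purity(num_matches, num_non_matches):
    # Purity is the fraction of the majority class in the classified sample.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Binary Shannon entropy (base 2) of the match proportion.
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        p = count / total
        if p > 0.0:
            h -= p * math.log2(p)
    return h

# Figures from the oracle classification above: 28 matches, 57 non-matches.
print(round(purity(28, 57), 3))   # 0.671
print(round(entropy(28, 57), 3))  # 0.914
print(round(28 / (28 + 57), 3))   # estimated match proportion: 0.329
```

The same two numbers (0.671 and 0.914) are then carried over as the purity and entropy estimates of the two sub-clusters produced by the split.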

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 714 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 150 matches and 564 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (564, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 150 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
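
Farthest-first selection, as used above, picks a start vector and then repeatedly adds the vector whose minimum distance to the already-selected set is largest, spreading the sample across the cluster. A minimal sketch (the function name, the deterministic start choice, and the toy points are assumptions):

```python
def farthest_first(vectors, k):
    # Greedy farthest-first traversal: start with the first vector, then
    # repeatedly select the vector maximising the minimum squared
    # Euclidean distance to all vectors selected so far.
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    selected = [vectors[0]]
    remaining = vectors[1:]
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(sq_dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

points = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.5, 0.5], [1.0, 0.9]]
print(farthest_first(points, 3))  # [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
```

Compared to random sampling, this tends to pull in weight vectors from all corners of the cluster, which is why the selected lists above mix clear matches and clear non-matches.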

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0
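
These runs all use an oracle accuracy of 100%, so every label comes back unchanged and no false (non-)matches occur. For lower settings of the `oracle_acc` parameter, each true match status would be flipped with probability 1 − accuracy; a minimal sketch of such a noisy oracle (the function name and seeding are illustrative):

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    # Return each true label unchanged with probability `accuracy`,
    # otherwise flipped (simulating an imperfect human oracle).
    rng = rng if rng is not None else random.Random(0)
    return [lab if rng.random() < accuracy else not lab for lab in true_labels]

labels = [True, False, True, True, False]
print(oracle_classify(labels, 1.0))  # [True, False, True, True, False]
```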

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(20)785_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 785), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)785_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1069
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1069 weight vectors
  Containing 221 true matches and 848 true non-matches
    (20.67% true matches)
  Identified 1013 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   977  (96.45%)
          2 :    33  (3.26%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1013 unique weight vectors)
Pureness (as fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1068
  Number of unique weight vectors: 1013
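
The analysis step above groups identical weight vectors, counts how often each occurs, and computes each unique vector's pureness (the fraction of true matches among its occurrences); a unique vector that is neither all matches nor all non-matches has its minority-class copies removed. A minimal sketch of that bookkeeping (function and variable names are assumptions):

```python
from collections import Counter, defaultdict

def analyse(weight_vectors, labels):
    # Group identical vectors, then report occurrence frequencies and
    # the pureness (match fraction) of each unique vector.
    occ = Counter(tuple(v) for v in weight_vectors)
    match_count = defaultdict(int)
    for v, lab in zip(weight_vectors, labels):
        if lab:
            match_count[tuple(v)] += 1
    freq_dist = Counter(occ.values())  # occurrence -> number of unique vectors
    pureness = {v: match_count[v] / n for v, n in occ.items()}
    return freq_dist, pureness

vecs = [[1.0, 0.9], [1.0, 0.9], [0.2, 0.1], [1.0, 0.9]]
labels = [True, True, False, False]
freq, pure = analyse(vecs, labels)
print(dict(freq))        # {3: 1, 1: 1}
print(pure[(1.0, 0.9)])  # ~0.667: non-pure, so minority copies are removed
```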

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1013, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1013 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1013 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 926 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 106 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (106, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(15)699_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 699), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)699_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 731
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 731 weight vectors
  Containing 210 true matches and 521 true non-matches
    (28.73% true matches)
  Identified 697 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   680  (97.56%)
          2 :    14  (2.01%)
          3 :     2  (0.29%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 697 unique weight vectors)
Pureness (as fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 178
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 518

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 730
  Number of unique weight vectors: 697

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (697, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 697 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 697 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 613 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 142 matches and 471 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (471, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 142 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0
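
The purity and entropy values reported for each oracle-classified sample follow the usual two-class definitions: purity is the majority-class fraction of the sample, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (the function name is illustrative, not from the original program):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary entropy of a two-class sample."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The 54-vector sample above: 49 matches, 5 non-matches
purity, entropy = purity_and_entropy(49, 5)
print(round(purity, 3), round(entropy, 3))  # 0.907 0.445
```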

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analyzing file: diverg(20)392_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 392), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)392_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec
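
The pureness analysis above groups duplicate weight vectors and computes, for each unique vector, the fraction of its occurrences that are true matches; a unique vector with pureness strictly between 0 and 1 is non-pure, and its minority-class copies are removed. A sketch of that grouping step, assuming list-of-tuples input (illustrative, not the original code):

```python
from collections import defaultdict

def pureness_per_unique_vector(weight_vectors, labels):
    # Group duplicate weight vectors; pureness of a unique vector is the
    # fraction of its occurrences that are true matches.
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, labels):
        groups[tuple(vec)].append(is_match)
    return {vec: sum(lbls) / len(lbls) for vec, lbls in groups.items()}

# Toy example: one unique vector occurs three times with mixed labels.
vectors = [(1.0, 0.9), (1.0, 0.9), (0.2, 0.1), (1.0, 0.9), (0.5, 0.5)]
labels = [True, True, False, False, True]
pure = pureness_per_unique_vector(vectors, labels)
# (1.0, 0.9) has pureness 2/3: it is non-pure, so its single minority
# (non-match) occurrence would be removed, as in the step above.
```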

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using the "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
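
The "far" (farthest-first) selection above can be sketched as a greedy traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. The choice of start vector and the Euclidean metric are assumptions here; the original program may differ:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors
    (tuples). A sketch of the "far" method named in the log."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # assumed start vector
    while len(selected) < k:
        # Pick the vector farthest from everything selected so far.
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected

points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(points, 3))
```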

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches
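
After the oracle classifies the sample, a classifier is trained on those labelled vectors and used to split the remaining cluster into a predicted-match and a predicted-non-match child. The log reports an SVM; the dependency-free sketch below substitutes a nearest-centroid rule as a stand-in (with scikit-learn one would fit sklearn.svm.SVC on the same training data):

```python
def split_by_classifier(train_matches, train_non_matches, remaining):
    """Split remaining cluster vectors using a classifier trained on the
    oracle-labelled sample. Nearest-centroid stand-in for the SVM step."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_match = centroid(train_matches)
    c_non = centroid(train_non_matches)
    matches, non_matches = [], []
    for v in remaining:
        if sq_dist(v, c_match) <= sq_dist(v, c_non):
            matches.append(v)
        else:
            non_matches.append(v)
    # The two child clusters would be pushed back onto the queue.
    return matches, non_matches

m, n = split_by_classifier([(1.0, 1.0)], [(0.0, 0.0)],
                           [(0.9, 0.8), (0.1, 0.2)])
```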

39.0
Analyzing file: diverg(10)482_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.982143
recall                 0.183946
f-measure              0.309859
da                           56
dm                            0
ndm                           0
tp                           55
fp                            1
tn                  4.76529e+07
fn                          244
Name: (10, 1 - acm diverg, 482), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)482_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 435
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 435 weight vectors
  Containing 205 true matches and 230 true non-matches
    (47.13% true matches)
  Identified 403 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   389  (96.53%)
          2 :    11  (2.73%)
          3 :     2  (0.50%)
         18 :     1  (0.25%)

Identified 1 non-pure unique weight vector (from 403 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 173
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 229

Removed 1 non-pure weight vector

Final number of weight vectors to use: 434
  Number of unique weight vectors: 403

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (403, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 403 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using the "far" method

Farthest first selection of 77 weight vectors from 403 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 36 matches and 41 non-matches
    Purity of oracle classification:  0.532
    Entropy of oracle classification: 0.997
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  41
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 326 weight vectors
  Based on 36 matches and 41 non-matches
  Classified 136 matches and 190 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.5324675324675324, 0.9969562518473083, 0.4675324675324675)
    (190, 0.5324675324675324, 0.9969562518473083, 0.4675324675324675)

Current size of match and non-match training data sets: 36 / 41

Selected cluster (queue ordering: random) with:
- Purity 0.53 and entropy 1.00
- Size 136 weight vectors
- Estimated match proportion 0.468

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 49 matches and 7 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

56.0
Analyzing file: diverg(15)659_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 659), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)659_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1092
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1092 weight vectors
  Containing 221 true matches and 871 true non-matches
    (20.24% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1000  (96.53%)
          2 :    33  (3.19%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 850

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1091
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using the "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 845 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (845, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 845 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 845 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
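
Farthest-first selection greedily picks the weight vector whose minimum distance to the already-selected set is largest, spreading the sample across the cluster. A rough sketch, assuming Euclidean distance and seeding from the first vector (the original script's metric and seed may differ):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector,
    then repeatedly add the vector farthest from the selected set."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # pick the candidate whose nearest selected vector is farthest away
        best = max(remaining,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# toy example: the two most distant vectors are picked first
sample = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [0.9, 1.0]], 2)
```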

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
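
The oracle stands in for a human annotator with a configurable accuracy: each queried vector receives its true label with probability oracle_acc and the flipped label otherwise. A hedged sketch (the function name is illustrative):

```python
import random

def oracle_classify(true_labels, oracle_acc, rng=random):
    """Simulate a human oracle: return each true match (True) /
    non-match (False) label correctly with probability oracle_acc,
    flipped otherwise."""
    labelled = []
    for is_match in true_labels:
        if rng.random() < oracle_acc:
            labelled.append(is_match)        # correct classification
        else:
            labelled.append(not is_match)    # oracle error
    return labelled

# With oracle_acc = 1.0 every label comes back unchanged, matching the
# 100% accuracy runs above where the oracle wrongly classifies 0 vectors.
```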

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further
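
Whether a cluster goes back onto the queue for further splitting depends on the min_purity, max_cluster_size and min_cluster_size parameters listed in the script's usage message; the exact condition is not shown in the log, so the following is only a plausible sketch:

```python
def needs_splitting(size, purity, min_purity, max_cluster_size,
                    min_cluster_size):
    """A cluster is split further while it is impure or oversized,
    provided it is still large enough to yield two viable sub-clusters.
    (Sketch only: the original script's exact rule is not in the log.)"""
    impure_or_large = purity < min_purity or size > max_cluster_size
    return impure_or_large and size >= 2 * min_cluster_size

# e.g. the 845-vector cluster with purity 0.74 above would be re-queued
```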

Reached end of manual classification budget

46.0
Analysing file: diverg(10)349_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 349), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)349_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 332
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 332 weight vectors
  Containing 178 true matches and 154 true non-matches
    (53.61% true matches)
  Identified 300 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   285  (95.00%)
          2 :    12  (4.00%)
          3 :     2  (0.67%)
         17 :     1  (0.33%)
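
The duplicate analysis above can be obtained by counting identical weight vectors and then tabulating how often each multiplicity occurs. A minimal sketch with collections.Counter (function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that appear exactly that often."""
    per_vector = Counter(tuple(v) for v in weight_vectors)  # vector -> count
    return Counter(per_vector.values())                     # count -> #vectors

# two vectors occur once, one vector occurs twice
dist = occurrence_distribution([(1, 0), (1, 0), (0, 1), (0.5, 0.5)])
```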

Identified 1 non-pure unique weight vector (from 300 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 148
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 151

Removed 1 non-pure weight vector
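
A weight vector is non-pure when identical vectors carry conflicting true match labels; the minority label's copies are dropped so every remaining unique vector is pure. A sketch assuming (vector, label) input pairs (the helper name is illustrative):

```python
from collections import defaultdict

def remove_minority_copies(pairs):
    """Drop the minority-class copies of any weight vector that occurs
    with both match (True) and non-match (False) labels."""
    by_vector = defaultdict(list)
    for vec, is_match in pairs:
        by_vector[tuple(vec)].append(is_match)
    kept = []
    for vec, is_match in pairs:
        labels = by_vector[tuple(vec)]
        majority = sum(labels) * 2 >= len(labels)  # ties keep the match label
        if is_match == majority:
            kept.append((vec, is_match))
    return kept

# e.g. a vector seen 16 times as a match and once as a non-match has
# pureness 16/17 ~ 0.941; the single non-match copy is removed.
pairs = [((1, 1), True)] * 16 + [((1, 1), False), ((0, 0), False)]
```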

Final number of weight vectors to use: 331
  Number of unique weight vectors: 300

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (300, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 300 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 73

Perform initial selection using "far" method

Farthest first selection of 73 weight vectors from 300 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 32 matches and 41 non-matches
    Purity of oracle classification:  0.562
    Entropy of oracle classification: 0.989
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  41
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 227 weight vectors
  Based on 32 matches and 41 non-matches
  Classified 145 matches and 82 non-matches
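
When a cluster is impure but can still be split, the remaining unlabelled vectors are partitioned by a classifier trained on the oracle-labelled sample; the log shows an SVM. A hedged sketch with scikit-learn (the original script's kernel and parameters are unknown; a linear kernel is assumed here):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-classified sample and split the
    remaining cluster into predicted matches and non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if not p]
    return matches, non_matches

# toy example with one similarity feature, where high values are matches
m, n = svm_split([[0.9], [0.8], [0.1], [0.2]], [1, 1, 0, 0],
                 [[0.95], [0.05]])
```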

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 73
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.5616438356164384, 0.9890076795739704, 0.4383561643835616)
    (82, 0.5616438356164384, 0.9890076795739704, 0.4383561643835616)

Current size of match and non-match training data sets: 32 / 41

Selected cluster with (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 82 weight vectors
- Estimated match proportion 0.438

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 82 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.456, 1.000, 0.087, 0.208, 0.125, 0.152, 0.061] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 1 match and 43 non-matches
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(15)503_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 503), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)503_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1027
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1027 weight vectors
  Containing 223 true matches and 804 true non-matches
    (21.71% true matches)
  Identified 973 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   936  (96.20%)
          2 :    34  (3.49%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 973 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 783

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 973

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (973, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 973 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 973 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 886 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 131 matches and 755 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (755, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 131 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 49 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.141
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)288_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 288), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)288_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
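The farthest-first selection above can be sketched as a greedy max-min traversal: each new pick maximises the minimum distance to the vectors already selected. This is a minimal sketch of the standard algorithm; the actual selection code in recursive-train-selection.py may differ (e.g. in its choice of starting vector or distance metric).

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedily select k vector indices: each new pick maximises the
    minimum Euclidean distance to the already selected vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    while len(selected) < k:
        best_i, best_d = None, -1.0
        for i in range(len(vectors)):
            if i in selected:
                continue
            # Distance from candidate i to its closest selected vector.
            d = min(dist(vectors[i], vectors[j]) for j in selected)
            if d > best_d:
                best_i, best_d = i, d
        selected.append(best_i)
    return selected
```

The greedy choice tends to spread the sample across the whole cluster, which is why the selected weight vectors above mix clear matches, clear non-matches, and borderline cases.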

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 26 matches and 62 non-matches
    Purity of oracle classification:  0.705
    Entropy of oracle classification: 0.876
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0
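The purity and entropy reported for each oracle classification follow the standard two-class definitions: purity is the majority-class fraction of the sample, and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch (the function name is illustrative):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the 26 matches and 62 non-matches above this gives purity ≈ 0.705 and entropy ≈ 0.876, matching the logged values.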

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 26 matches and 62 non-matches
  Classified 119 matches and 829 non-matches
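The split step trains a classifier on the oracle-labelled sample and uses its predictions to divide the remaining weight vectors into a candidate-match cluster and a candidate-non-match cluster, both of which go back on the queue. This is a minimal sketch using scikit-learn's `SVC`; the original program's SVM kernel and parameters are not shown in this log, so defaults are assumed.

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, rest_vectors):
    """Fit an SVM on the oracle-labelled sample, then split the
    unlabelled remainder by predicted class (True = match)."""
    clf = SVC()  # default RBF kernel; actual parameters are an assumption
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(rest_vectors)
    match_cluster = [v for v, p in zip(rest_vectors, preds) if p]
    non_match_cluster = [v for v, p in zip(rest_vectors, preds) if not p]
    return match_cluster, non_match_cluster
```

In the run above, 26 matches and 62 non-matches from the oracle yield a split of the 948 remaining vectors into clusters of 119 and 829, the two queue entries seen in Loop 2.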

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (119, 0.7045454545454546, 0.8756633923230397, 0.29545454545454547)
    (829, 0.7045454545454546, 0.8756633923230397, 0.29545454545454547)

Current size of match and non-match training data sets: 26 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 119 weight vectors
- Estimated match proportion 0.295

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 119 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
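The oracle step simulates a human annotator with a given accuracy: each queried weight vector receives its true match status with probability equal to the accuracy, and the flipped status otherwise. A minimal sketch (the function name is illustrative):

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Return labels where each true label is kept with probability
    `accuracy` and flipped otherwise (an imperfect oracle)."""
    rng = rng or random.Random()
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

With accuracy 1.0, as in this run, every queried label is returned correctly, so the false match and false non-match counts stay at zero.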

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)463_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 463), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)463_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 664
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 664 weight vectors
  Containing 212 true matches and 452 true non-matches
    (31.93% true matches)
  Identified 612 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   576  (94.12%)
          2 :    33  (5.39%)
          3 :     2  (0.33%)
         16 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 612 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 431
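This pre-processing groups identical weight vectors, computes each group's pureness as the fraction of its record pairs that are true matches, and drops the minority-class copies of any group that is not fully pure. A minimal sketch (names are illustrative; the original program's implementation is not shown in this log):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels):
    """Group identical weight vectors; for groups mixing matches and
    non-matches, drop the minority-class copies."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, labels):
        groups[tuple(vec)].append(is_match)

    kept_vecs, kept_labels = [], []
    for vec, flags in groups.items():
        num_match = sum(flags)
        pure = num_match in (0, len(flags))      # all one class?
        majority = num_match * 2 >= len(flags)   # majority class of the group
        for flag in flags:
            if pure or flag == majority:
                kept_vecs.append(list(vec))
                kept_labels.append(flag)
    return kept_vecs, kept_labels
```

For the vector above occurring 16 times with pureness 0.938 (15 matches, 1 non-match), the single minority non-match copy is dropped, which accounts for the one removed weight vector reported next.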

Removed 1 non-pure weight vector

Final number of weight vectors to use: 663
  Number of unique weight vectors: 612

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (612, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 612 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 612 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 529 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 155 matches and 374 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (374, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 374 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 374 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.767, 0.667, 0.545, 0.786, 0.773] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.857, 0.444, 0.556, 0.235, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 2 matches and 68 non-matches
    Purity of oracle classification:  0.971
    Entropy of oracle classification: 0.187
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(20)383_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 383), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)383_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1069
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1069 weight vectors
  Containing 221 true matches and 848 true non-matches
    (20.67% true matches)
  Identified 1013 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   977  (96.45%)
          2 :    33  (3.26%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1013 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1068
  Number of unique weight vectors: 1013

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1013, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1013 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1013 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 926 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 106 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (106, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
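The "farthest first" selections logged above greedily grow a sample by repeatedly adding the vector whose minimum distance to the already-selected vectors is largest. A minimal sketch, assuming Euclidean distance and a random starting vector (both are assumptions about the original implementation):

```python
import math
import random

def farthest_first(vectors, k, seed=None):
    # Greedy farthest-first traversal: start from one random vector, then
    # repeatedly add the vector whose minimum distance to the current
    # selection is largest.
    rng = random.Random(seed)
    remaining = list(vectors)
    selected = [remaining.pop(rng.randrange(len(remaining)))]
    while len(selected) < k and remaining:
        best_idx, best_dist = 0, -1.0
        for i, v in enumerate(remaining):
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best_idx, best_dist = i, d
        selected.append(remaining.pop(best_idx))
    return selected
```

This spreads the sample across the cluster's extremes, which is why the listings above mix clear matches, clear non-matches, and borderline vectors.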

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(10)873_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (10, 1 - acm diverg, 873), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)873_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 751
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 751 weight vectors
  Containing 204 true matches and 547 true non-matches
    (27.16% true matches)
  Identified 713 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   695  (97.48%)
          2 :    15  (2.10%)
          3 :     2  (0.28%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 713 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 168
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 544

Removed 1 non-pure weight vector
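The pureness analysis above finds unique weight vectors that occur with both match and non-match labels, and drops the minority-class occurrences (e.g. the vector with pureness 0.950 occurred 20 times: 19 matches kept, 1 non-match removed). A minimal sketch, assuming the data is held as (vector, is_match) pairs:

```python
from collections import Counter

def remove_minority_class(weight_vectors):
    # weight_vectors: list of (vector_tuple, is_match) pairs.
    # For each unique vector, count how often it occurs as a match and
    # as a non-match, then drop the minority-class occurrences.
    counts = {}
    for vec, is_match in weight_vectors:
        counts.setdefault(vec, Counter())[is_match] += 1
    kept = []
    for vec, is_match in weight_vectors:
        c = counts[vec]
        # Keep the pair only if its label is the (weak) majority
        if c[is_match] >= c[not is_match]:
            kept.append((vec, is_match))
    return kept
```

Using `>=` keeps both classes in the (rare) case of an exact tie; how the original script breaks ties is not shown in this log.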

Final number of weight vectors to use: 750
  Number of unique weight vectors: 713

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (713, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 713 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 713 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 32 matches and 52 non-matches
    Purity of oracle classification:  0.619
    Entropy of oracle classification: 0.959
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 629 weight vectors
  Based on 32 matches and 52 non-matches
  Classified 184 matches and 445 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (184, 0.6190476190476191, 0.9587118829771318, 0.38095238095238093)
    (445, 0.6190476190476191, 0.9587118829771318, 0.38095238095238093)

Current size of match and non-match training data sets: 32 / 52

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 184 weight vectors
- Estimated match proportion 0.381

Sample size for this cluster: 61

Farthest first selection of 61 weight vectors from 184 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.817, 1.000, 0.250, 0.212, 0.256, 0.045, 0.250] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.780, 1.000, 0.271, 0.152, 0.137, 0.250, 0.167] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)

Perform oracle with 100.00% accuracy on 61 weight vectors
  The oracle will correctly classify 61 weight vectors and wrongly classify 0
  Classified 48 matches and 13 non-matches
    Purity of oracle classification:  0.787
    Entropy of oracle classification: 0.747
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  13
    Number of false non-matches: 0

Deleted 61 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(10)273_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (10, 1 - acm diverg, 273), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)273_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 999
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 999 weight vectors
  Containing 186 true matches and 813 true non-matches
    (18.62% true matches)
  Identified 957 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   926  (96.76%)
          2 :    28  (2.93%)
          3 :     2  (0.21%)
         11 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 957 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 164
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 792

Removed 1 non-pure weight vector

Final number of weight vectors to use: 998
  Number of unique weight vectors: 957

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (957, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 957 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 957 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 870 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 301 matches and 569 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (301, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (569, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 301 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 301 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 41 matches and 27 non-matches
    Purity of oracle classification:  0.603
    Entropy of oracle classification: 0.969
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  27
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(20)651_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 651), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)651_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
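
The occurrence distribution above (how many distinct weight vectors appear once, twice, and so on) is a two-level count; a minimal sketch with an illustrative helper name:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that occur exactly that often."""
    per_vector = Counter(map(tuple, weight_vectors))   # vector -> count
    return Counter(per_vector.values())                # count -> #vectors

dist = occurrence_distribution([[0.1], [0.1], [0.2], [0.3]])
# two vectors occur once, one vector occurs twice
```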

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
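
Non-pure unique vectors (the same weight vector labelled as both a match and a non-match) are resolved by dropping the minority-class copies; a sketch under that assumption (the script's exact tie handling is not shown, and `remove_minority_copies` is an illustrative name):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    """Drop copies of a weight vector whose label is in the minority
    for that vector; ties count the match class as the majority."""
    groups = defaultdict(list)
    for vec, lab in zip(map(tuple, weight_vectors), labels):
        groups[vec].append(lab)
    kept_vecs, kept_labels = [], []
    for vec, lab in zip(map(tuple, weight_vectors), labels):
        # True if matches are the (weak) majority for this vector
        majority = sum(groups[vec]) * 2 >= len(groups[vec])
        if lab == majority:
            kept_vecs.append(list(vec))
            kept_labels.append(lab)
    return kept_vecs, kept_labels
```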

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
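
Farthest-first selection, as used for the samples above, greedily adds the vector whose nearest already-selected neighbour is farthest away; a minimal Euclidean-distance sketch (not the original implementation, which may seed and measure distance differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric tuples."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]              # arbitrary seed: first vector
    while len(selected) < k:
        # pick the unselected vector maximising its minimum distance
        # to the already-selected set
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```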

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
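
The oracle itself is simulated: with accuracy a, each queried label is assumed to be returned correctly with probability a and flipped otherwise (the script's exact mechanism is not shown; `simulated_oracle` is an illustrative name). A sketch:

```python
import random

def simulated_oracle(true_labels, accuracy, seed=0):
    """Return each true label unchanged with probability `accuracy`,
    flipped otherwise (one independent draw per label)."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]

# At accuracy 1.0 the oracle is perfect, as in the runs in this log
```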

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
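
The split step trains an SVM on the oracle-labelled vectors and partitions the rest of the cluster by predicted class. A scikit-learn sketch, assuming a linear kernel (the log does not show which kernel or parameters the script uses; `svm_split` is an illustrative name):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Fit an SVM on labelled vectors, then split the remaining
    cluster into predicted matches and predicted non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if not p]
    return matches, non_matches
```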

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)170_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 170), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)170_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 793
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 793 weight vectors
  Containing 187 true matches and 606 true non-matches
    (23.58% true matches)
  Identified 751 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   720  (95.87%)
          2 :    28  (3.73%)
          3 :     2  (0.27%)
         11 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 751 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 165
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 585

Removed 1 non-pure weight vector

Final number of weight vectors to use: 792
  Number of unique weight vectors: 751

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (751, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 751 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 751 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 666 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 308 matches and 358 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (308, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (358, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 358 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 358 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.700, 0.429, 0.476, 0.647, 0.810] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.370, 0.818, 0.800, 0.550, 0.500] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.857, 0.875, 0.625, 0.333, 0.667] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 0 matches and 70 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(20)336_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (20, 1 - acm diverg, 336), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)336_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 908
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 908 weight vectors
  Containing 212 true matches and 696 true non-matches
    (23.35% true matches)
  Identified 856 unique weight vectors
  Frequency distribution of occurences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          2 :    33  (3.86%)
          3 :     2  (0.23%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 856 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 675

Removed 1 non-pure weight vector

Final number of weight vectors to use: 907
  Number of unique weight vectors: 856

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (856, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 856 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 856 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
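
The purity, entropy, and estimated match proportion reported throughout the log follow directly from the oracle's match/non-match counts: purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion. A minimal sketch of that computation (the function name is mine, not from the program):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity is the fraction of the majority class; entropy is the
    binary Shannon entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total                          # estimated match proportion
    purity = max(num_matches, num_non_matches) / total
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                                  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy, p
```

For the 29 matches and 57 non-matches above this gives purity 0.663, entropy 0.922, and match proportion 0.337, matching the queue statistics printed in Loop 2.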

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 770 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 165 matches and 605 non-matches
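
After the oracle labels a sample, the remaining weight vectors in the cluster are classified with an SVM trained on those labels, and the cluster is split into predicted-match and predicted-non-match sub-clusters (the two queue entries above). The sketch below illustrates that train-on-oracle-labels, split-the-rest pattern; a simple perceptron stands in for the SVM the program actually uses, and all names are illustrative:

```python
def train_linear(samples, labels, epochs=50, lr=0.1):
    """Perceptron-style training of a linear decision function
    sign(w . x + b); returns (weights, bias)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            target = 1.0 if y else -1.0
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if target * score <= 0.0:                # misclassified: update
                w = [wi + lr * target * xi for wi, xi in zip(w, x)]
                b += lr * target
    return w, b

def split_cluster(cluster, w, b):
    """Split the remaining weight vectors into predicted matches and
    predicted non-matches using the trained linear classifier."""
    matches, non_matches = [], []
    for x in cluster:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        (matches if score > 0.0 else non_matches).append(x)
    return matches, non_matches
```

Each resulting sub-cluster goes back onto the queue with the statistics inherited from the oracle sample, which is why both queue entries above show the same purity and entropy.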

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (165, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (605, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 165 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 165 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
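
The "farthest first" selection seen throughout the log greedily picks weight vectors that maximise the minimum distance to those already chosen, so the oracle sample spreads across the whole cluster rather than clustering in one region. A minimal sketch, in which the choice of starting vector and the Euclidean metric are assumptions:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector, then
    repeatedly add the vector whose minimum Euclidean distance to the
    already-selected set is largest."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```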

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 46 matches and 11 non-matches
    Purity of oracle classification:  0.807
    Entropy of oracle classification: 0.708
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0
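
The oracle blocks simulate a human annotator with a configurable accuracy (the oracle_acc parameter from the usage notes): at 100% accuracy, as in these runs, every queried true match status is returned unchanged, while a lower accuracy would flip each label independently with probability 1 - acc. A sketch of that behaviour (the function name and seeding are mine):

```python
import random

def noisy_oracle(true_labels, acc, seed=42):
    """Return the true match statuses, each independently flipped with
    probability (1 - acc), modelling an imperfect human oracle."""
    rng = random.Random(seed)
    return [lab if rng.random() < acc else not lab for lab in true_labels]
```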

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(15)363_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (15, 1 - acm diverg, 363), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)363_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 863
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 863 weight vectors
  Containing 160 true matches and 703 true non-matches
    (18.54% true matches)
  Identified 829 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   799  (96.38%)
          2 :    27  (3.26%)
          3 :     2  (0.24%)
          4 :     1  (0.12%)
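
The frequency distribution above counts how often each exact weight vector appears in the loaded file, then histograms those counts. A minimal sketch using `collections.Counter` (the function name is mine):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each exact weight vector occurs, then histogram
    those counts: {occurrence_count: number_of_unique_vectors}."""
    vec_counts = Counter(map(tuple, weight_vectors))
    return Counter(vec_counts.values())
```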

Identified 0 non-pure unique weight vectors (from 829 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 146
     0.000 : 683

Removed 0 non-pure weight vectors
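
A weight vector that occurs several times may carry conflicting true-match labels; its "pureness" is the fraction of its copies that are matches, and copies belonging to the minority class are removed, as in the later run where one vector with pureness 0.950 loses its single non-match copy. A sketch under those assumptions (names are mine):

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """Drop minority-class copies of any weight vector whose duplicate
    copies disagree on the true match status; pure vectors are kept whole."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        n_match = sum(labels)
        n_non = len(labels) - n_match
        if n_match and n_non:                    # non-pure: keep majority copies only
            majority = n_match > n_non
            kept.extend((vec, majority) for _ in range(max(n_match, n_non)))
        else:                                    # pure: keep everything
            kept.extend((vec, lab) for lab in labels)
    return kept
```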

Final number of weight vectors to use: 863
  Number of unique weight vectors: 829

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (829, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 829 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 829 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 26 matches and 60 non-matches
    Purity of oracle classification:  0.698
    Entropy of oracle classification: 0.884
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 743 weight vectors
  Based on 26 matches and 60 non-matches
  Classified 94 matches and 649 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.6976744186046512, 0.8841151220488478, 0.3023255813953488)
    (649, 0.6976744186046512, 0.8841151220488478, 0.3023255813953488)

Current size of match and non-match training data sets: 26 / 60

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 649 weight vectors
- Estimated match proportion 0.302

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 649 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 11 matches and 61 non-matches
    Purity of oracle classification:  0.847
    Entropy of oracle classification: 0.617
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing the file: diverg(20)404_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 404), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)404_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1086
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1086 weight vectors
  Containing 220 true matches and 866 true non-matches
    (20.26% true matches)
  Identified 1030 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   994  (96.50%)
          2 :    33  (3.20%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1030 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 845

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1085
  Number of unique weight vectors: 1030

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1030, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1030 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1030 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 24 matches and 64 non-matches
    Purity of oracle classification:  0.727
    Entropy of oracle classification: 0.845
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 942 weight vectors
  Based on 24 matches and 64 non-matches
  Classified 86 matches and 856 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (86, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)
    (856, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)

Current size of match and non-match training data sets: 24 / 64

Selected cluster with (queue ordering: random):
- Purity 0.73 and entropy 0.85
- Size 86 weight vectors
- Estimated match proportion 0.273

Sample size for this cluster: 41

Farthest first selection of 41 weight vectors from 86 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 0.950, 0.923, 0.941] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
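"Farthest first" selection is a greedy traversal: start from one vector, then repeatedly add the candidate whose minimum distance to everything already selected is largest. A stdlib-only sketch, assuming Euclidean distance and seeding from the first vector (the script's actual seeding rule may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising its minimum
    Euclidean distance to the vectors already selected."""
    selected = [vectors[0]]          # assumed seed: the first vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```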

Perform oracle with 100.00% accuracy on 41 weight vectors
  The oracle will correctly classify 41 weight vectors and wrongly classify 0
  Classified 41 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0
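With an oracle accuracy below 100%, each queried label would be flipped with the complementary probability. A hedged sketch of such an imperfect oracle (function and parameter names are illustrative, not from the script):

```python
import random

def query_oracle(true_labels, accuracy, rng=None):
    """Return oracle answers: each true label is reported correctly
    with probability `accuracy`, and flipped otherwise."""
    rng = rng or random.Random(0)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

At accuracy 1.0 every answer matches the truth, as in the run above.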

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 41 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(15)960_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 960), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)960_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 729
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 729 weight vectors
  Containing 210 true matches and 519 true non-matches
    (28.81% true matches)
  Identified 695 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   678  (97.55%)
          2 :    14  (2.01%)
          3 :     2  (0.29%)
         17 :     1  (0.14%)
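The occurrence distribution above can be computed with two nested Counters: one over the vectors themselves, one over their counts. A minimal sketch:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map 'occurrence count n' -> 'number of unique vectors seen n times'."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```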

Identified 1 non-pure unique weight vector (from 695 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 516

Removed 1 non-pure weight vector

Final number of weight vectors to use: 728
  Number of unique weight vectors: 695

Time to load and analyse the weight vector file: 0.01 sec
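The pureness clean-up above groups identical weight vectors, computes the fraction of true matches per group, and removes the minority-class copies of each non-pure group (a later run in this log instead drops all copies of a non-pure vector, so the exact rule likely depends on a purity threshold). A sketch of the minority-removal variant, with illustrative names:

```python
from collections import defaultdict

def remove_minority_labels(vectors, labels):
    """Group identical vectors; within each non-pure group keep only
    the majority-class copies (ties keep the non-match side here)."""
    groups = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        groups[tuple(vec)].append(lab)
    kept = []
    for vec, labs in groups.items():
        pureness = sum(labs) / len(labs)
        if 0.0 < pureness < 1.0:               # non-pure group
            majority = pureness > 0.5
            labs = [l for l in labs if l == majority]
        kept.extend((list(vec), l) for l in labs)
    return kept
```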

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (695, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 695 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 695 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 611 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 142 matches and 469 non-matches
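The split step trains a classifier on the oracle-labelled vectors and partitions the remaining cluster by its predictions. The script uses an SVM; as a self-contained, stdlib-only stand-in, a simple perceptron illustrates the same train-then-split pattern (all names here are illustrative, and this is a sketch, not the script's classifier):

```python
def train_linear(match_vecs, non_match_vecs, epochs=100, lr=0.1):
    """Tiny perceptron stand-in for the script's SVM: learns w, b
    so that w.x + b > 0 predicts 'match'."""
    w, b = [0.0] * len(match_vecs[0]), 0.0
    data = [(v, 1) for v in match_vecs] + [(v, -1) for v in non_match_vecs]
    for _ in range(epochs):
        for x, y in data:
            # Standard perceptron update on misclassified samples.
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def split_cluster(w, b, cluster):
    """Partition a cluster into predicted matches and non-matches."""
    score = lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b
    return ([x for x in cluster if score(x) > 0],
            [x for x in cluster if score(x) <= 0])
```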

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (469, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 142 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 51 matches and 4 non-matches
    Purity of oracle classification:  0.927
    Entropy of oracle classification: 0.376
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)185_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 185), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)185_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1061
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1061 weight vectors
  Containing 225 true matches and 836 true non-matches
    (21.21% true matches)
  Identified 1004 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   967  (96.31%)
          2 :    34  (3.39%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1004 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 815

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1060
  Number of unique weight vectors: 1004

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1004, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1004 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1004 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 917 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 130 matches and 787 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (787, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 130 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 130 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.956, 1.000, 1.000, 1.000, 0.966, 1.000, 0.971] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-matches
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)530_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (15, 1 - acm diverg, 530), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)530_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 927
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 927 weight vectors
  Containing 178 true matches and 749 true non-matches
    (19.20% true matches)
  Identified 888 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   858  (96.62%)
          2 :    27  (3.04%)
          3 :     2  (0.23%)
          9 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 888 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 159
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 728

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 918
  Number of unique weight vectors: 887

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (887, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 887 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 887 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 23 matches and 63 non-matches
    Purity of oracle classification:  0.733
    Entropy of oracle classification: 0.838
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
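
The purity, entropy, and estimated match proportion reported after each oracle round (and carried into the cluster queue) follow the standard binary-cluster definitions; a minimal sketch, with a hypothetical `cluster_stats` helper that is not part of the program itself:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    # Purity = share of the majority class; entropy = binary class entropy
    total = num_matches + num_non_matches
    p = num_matches / total                       # estimated match proportion
    purity = max(num_matches, num_non_matches) / total
    entropy = -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)
    return purity, entropy, p

# Reproduces the figures above for 23 matches / 63 non-matches
purity, entropy, prop = cluster_stats(23, 63)
print(round(purity, 3), round(entropy, 3), round(prop, 3))  # 0.733 0.838 0.267
```

These are the same three numbers that reappear in the `(size, purity, entropy, match proportion)` queue tuples below.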

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 801 weight vectors
  Based on 23 matches and 63 non-matches
  Classified 89 matches and 712 non-matches
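
The SVM step trains on the oracle-labelled sample and uses the predicted classes to split the remaining vectors into two child clusters. A sketch assuming scikit-learn's `SVC` with synthetic stand-in data; `X_labelled`, `y_labelled`, and `X_rest` are illustrative names, not the program's own:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_labelled = rng.random((86, 7))       # stand-in for the 86 oracle-labelled vectors
y_labelled = rng.integers(0, 2, 86)    # 1 = match, 0 = non-match
X_rest = rng.random((801, 7))          # stand-in for the unlabelled remainder

clf = SVC(kernel="linear")
clf.fit(X_labelled, y_labelled)
pred = clf.predict(X_rest)

# The predicted classes split the cluster into two child clusters,
# which are then pushed back onto the queue
match_cluster = X_rest[pred == 1]
non_match_cluster = X_rest[pred == 0]
print(len(match_cluster) + len(non_match_cluster))  # 801
```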

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (89, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)
    (712, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)

Current size of match and non-match training data sets: 23 / 63

Selected cluster with (queue ordering: random):
- Purity 0.73 and entropy 0.84
- Size 712 weight vectors
- Estimated match proportion 0.267
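
Each loop iteration pops a cluster from the queue, samples it for the oracle, deletes the classified vectors, and pushes the split results back whenever the cluster is still impure or too large. A condensed sketch under assumed names — `ask_oracle` and `split` are stand-ins for the manual classification and SVM steps, and the thresholds are illustrative:

```python
import random

def ask_oracle(pair):                    # stand-in for the manual oracle
    return pair[1]                       # here it just reveals the true label

def split(rest):                         # stand-in for the SVM-based split
    mid = len(rest) // 2
    return [rest[:mid], rest[mid:]]

def one_round(queue, budget_left, min_purity=0.9, max_size=100, seed=0):
    random.seed(seed)
    cluster = random.choice(queue)       # "queue ordering: random"
    queue.remove(cluster)
    sample = random.sample(cluster, min(len(cluster), budget_left))
    labels = [ask_oracle(p) for p in sample]
    rest = [p for p in cluster if p not in sample]    # delete classified pairs
    purity = max(sum(labels), len(labels) - sum(labels)) / len(labels)
    if purity < min_purity or len(rest) > max_size:   # impure or too large
        queue.extend(split(rest))                     # push child clusters
    return budget_left - len(sample)

cluster = [((i / 100.0,), i % 3 == 0) for i in range(200)]
queue = [cluster]
left = one_round(queue, budget_left=20)
print(left, len(queue))  # 0 2
```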

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 712 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
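
Farthest-first selection picks a start vector and then repeatedly adds the vector whose distance to its nearest already-selected vector is largest, spreading the sample across the cluster. A small pure-Python sketch, assuming Euclidean distance between weight vectors (function and variable names are illustrative):

```python
import math, random

def farthest_first(vectors, k, seed=0):
    random.seed(seed)
    selected = [random.choice(vectors)]          # arbitrary start vector
    while len(selected) < k:
        # For each candidate, find its distance to the nearest selected
        # vector; pick the candidate where that distance is largest.
        best, best_d = None, -1.0
        for v in vectors:
            if v in selected:
                continue
            d = min(math.dist(v, s) for s in selected)
            if d > best_d:
                best, best_d = v, d
        selected.append(best)
    return selected

vecs = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]
print(farthest_first(vecs, 2, seed=1))
```

With two tight groups of points, as here, the second selected vector always lands in the group opposite the start vector.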

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing the file: diverg(20)6_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 6), dtype: object
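
The precision, recall, and f-measure rows in the series above follow directly from the confusion counts in the same block (tp, fp, fn; tn is not needed for these three metrics) — a quick check with the standard definitions:

```python
def prf(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Counts from the series above: tp=39, fp=0, fn=260
p, r, f = prf(tp=39, fp=0, fn=260)
print(round(p, 6), round(r, 6), round(f, 6))  # 1.0 0.130435 0.230769
```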

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)6_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as the fraction of matches) for a given unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805
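
The duplicate and pureness analysis groups identical weight vectors, builds the occurrence histogram, and removes the minority-class copies of any vector that occurs with both match statuses. A sketch with illustrative names; the majority tie-break at 0.5 is an assumption, not taken from the program:

```python
from collections import Counter, defaultdict

def analyse(weight_vectors):               # [(vector_tuple, is_match), ...]
    by_vec = defaultdict(list)
    for vec, is_match in weight_vectors:
        by_vec[vec].append(is_match)
    freq = Counter(len(labels) for labels in by_vec.values())  # occurrence histogram
    kept = []
    for vec, labels in by_vec.items():
        pureness = sum(labels) / len(labels)   # fraction of matches
        majority = pureness >= 0.5             # assumed tie-break
        for is_match in labels:
            if is_match == majority:           # drop minority-class copies
                kept.append((vec, is_match))
    return freq, kept

# One vector occurring 20 times with pureness 0.95, one pure singleton
data = [((1.0, 0.9), True)] * 19 + [((1.0, 0.9), False)] + [((0.1, 0.2), False)]
freq, kept = analyse(data)
print(freq[20], freq[1], len(kept))  # 1 1 20
```

This mirrors the log above: the single 0.95-pure vector loses its one minority-class copy, and all pure vectors are kept unchanged.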

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 566 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)305_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 305), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)305_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 987
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 987 weight vectors
  Containing 212 true matches and 775 true non-matches
    (21.48% true matches)
  Identified 935 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   900  (96.26%)
          2 :    32  (3.42%)
          3 :     2  (0.21%)
         17 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 935 unique weight vectors)
Pureness (as the fraction of matches) for a given unique weight vector:
  Pureness : Count
     1.000 : 180
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 754

Removed 1 non-pure weight vector

Final number of weight vectors to use: 986
  Number of unique weight vectors: 935

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (935, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 935 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 935 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 32 matches and 55 non-matches
    Purity of oracle classification:  0.632
    Entropy of oracle classification: 0.949
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 848 weight vectors
  Based on 32 matches and 55 non-matches
  Classified 293 matches and 555 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (293, 0.632183908045977, 0.9489804585630242, 0.367816091954023)
    (555, 0.632183908045977, 0.9489804585630242, 0.367816091954023)

Current size of match and non-match training data sets: 32 / 55

Selected cluster (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 293 weight vectors
- Estimated match proportion 0.368

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 293 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
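The "farthest first selection" steps above are consistent with the standard greedy farthest-first traversal: seed with one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch under that assumption (Euclidean distance; `farthest_first` is a hypothetical helper, not the program's own function):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector, then
    # repeatedly add the vector whose minimum Euclidean distance to the
    # already-selected set is largest (ties broken by input order).
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best_idx = max(range(len(remaining)),
                       key=lambda i: min(math.dist(remaining[i], s)
                                         for s in selected))
        selected.append(remaining.pop(best_idx))
    return selected

# Tiny illustration: points on a line; the far endpoint is picked
# before any interior point.
print(farthest_first([[0.0], [1.0], [2.0], [10.0]], 3))
# → [[0.0], [10.0], [2.0]]
```

This diversity-seeking behaviour is why the selected samples above mix clearly matching and clearly non-matching weight vectors rather than clustering around one region.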

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 44 matches and 24 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  24
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analyzing the file: diverg(10)577_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 577), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)577_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 699
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 699 weight vectors
  Containing 219 true matches and 480 true non-matches
    (31.33% true matches)
  Identified 644 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   608  (94.41%)
          2 :    33  (5.12%)
          3 :     2  (0.31%)
         19 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 644 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority-class weight vectors with this pureness to be removed)
     0.000 : 459

Removed 1 non-pure weight vector

Final number of weight vectors to use: 698
  Number of unique weight vectors: 644

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (644, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 644 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 644 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 561 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 156 matches and 405 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (405, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 156 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 156 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 51 matches and 5 non-matches
    Purity of oracle classification:  0.911
    Entropy of oracle classification: 0.434
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing the file: diverg(20)272_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 272), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)272_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority-class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)858_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990385
recall                 0.344482
f-measure              0.511166
da                          104
dm                            0
ndm                           0
tp                          103
fp                            1
tn                  4.76529e+07
fn                          196
Name: (10, 1 - acm diverg, 858), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)858_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 579
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 579 weight vectors
  Containing 149 true matches and 430 true non-matches
    (25.73% true matches)
  Identified 562 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   553  (98.40%)
          2 :     6  (1.07%)
          3 :     2  (0.36%)
          8 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 562 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 134
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 427

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 571
  Number of unique weight vectors: 561
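
The clean-up step logged above groups identical weight vectors and discards the inconsistent copies. In the sketch below the minority-class copies of each non-pure vector are dropped (the "minority class ... to be removed" variant); the original script's exact removal policy may differ, e.g. dropping all copies of a non-pure vector:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, match_flags):
    """Group identical weight vectors; for any vector that occurs with
    both match and non-match labels, keep only its majority-class copies.
    """
    groups = defaultdict(list)  # vector tuple -> list of match flags
    for vec, is_match in zip(weight_vectors, match_flags):
        groups[tuple(vec)].append(is_match)

    kept_vecs, kept_flags = [], []
    for vec, flags in groups.items():
        pureness = sum(flags) / len(flags)   # fraction of matches
        if 0.0 < pureness < 1.0:
            # non-pure vector: drop the minority-class copies
            majority = pureness >= 0.5
            flags = [f for f in flags if f == majority]
        for f in flags:
            kept_vecs.append(list(vec))
            kept_flags.append(f)
    return kept_vecs, kept_flags
```

For example, a vector occurring 20 times with 19 match labels (pureness 0.950) loses its single non-match copy, as in the later runs of this log.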

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (561, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 561 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 561 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
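
Farthest-first selection, as logged above, picks a diverse sample: after a random seed vector, each step adds the vector farthest from everything selected so far. A sketch assuming Euclidean distance (the log does not show which metric the original script uses):

```python
import math
import random

def farthest_first(vectors, k, rng=random):
    """Select k vectors by farthest-first traversal: start from a random
    vector, then repeatedly add the vector whose distance to its closest
    already-selected vector is largest (Euclidean distance assumed).
    """
    selected = [rng.randrange(len(vectors))]
    # minimum distance from each vector to the selected set so far
    dist = [math.dist(v, vectors[selected[0]]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=dist.__getitem__)
        selected.append(nxt)
        for i, v in enumerate(vectors):
            dist[i] = min(dist[i], math.dist(v, vectors[nxt]))
    return [vectors[i] for i in selected]
```

Each added vector only requires updating the per-vector minimum distance, so selecting k of n vectors costs O(kn) distance computations.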

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 479 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 103 matches and 376 non-matches
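
The SVM split step trains on the oracle-labelled sample, then partitions the remaining cluster by predicted class. A minimal sketch assuming scikit-learn's `SVC` (the original script may use a different SVM implementation or kernel):

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on the oracle-labelled weight vectors, then split
    the remaining vectors into predicted matches and non-matches."""
    clf = svm.SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, pred) if p]
    non_matches = [v for v, p in zip(remaining_vecs, pred) if not p]
    return matches, non_matches
```

The two resulting sub-clusters are pushed back onto the queue, which is why the queue length grows from 1 to 2 in the next loop.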

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (376, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 376 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 376 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.600, 0.700, 0.600, 0.611, 0.706] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.826, 0.286, 0.857, 0.643] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 4 matches and 65 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.319
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

104.0
Analysing file: diverg(20)173_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 173), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)173_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1059
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1059 weight vectors
  Containing 227 true matches and 832 true non-matches
    (21.44% true matches)
  Identified 1002 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   965  (96.31%)
          2 :    34  (3.39%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1002 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 811

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1058
  Number of unique weight vectors: 1002

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1002, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1002 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1002 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 915 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 177 matches and 738 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (177, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (738, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 738 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 738 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)323_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 323), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)323_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
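
The farthest-first traversal used above starts from one weight vector and repeatedly adds the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and a fixed starting index (the script's actual metric and seeding are not shown in this log):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first selection of k vectors (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    # Minimum distance from every vector to the selected set so far.
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        far = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[far])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[far]))
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(vecs, 3))   # → [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```

Because each pick maximises the distance to everything chosen so far, the selected sample spreads across the weight-vector space instead of clustering near the start point.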

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
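
The purity and entropy figures reported here follow the standard two-class definitions: purity is the majority-class fraction, and entropy is the binary (base-2) entropy of the match proportion. A minimal sketch (the function name is illustrative, not from the original script):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary entropy of a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total               # match proportion
    purity = max(p, 1.0 - p)              # fraction of the majority class
    if p in (0.0, 1.0):                   # a pure cluster has zero entropy
        entropy = 0.0
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy

# The 23 matches / 65 non-matches classified above give
# purity ≈ 0.739 and entropy ≈ 0.829, matching the log.
print(cluster_stats(23, 65))
```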

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
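
The split step trains a classifier on the 88 oracle-labelled vectors and partitions the 956 remaining vectors into a predicted-match and a predicted-non-match child cluster (the two clusters entering the queue in Loop 2). A minimal sketch using scikit-learn's SVC; the kernel and parameters are assumptions, since the log does not show them:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining vectors into predicted matches and predicted non-matches."""
    clf = SVC(kernel="linear")            # kernel choice is an assumption
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(remaining_vecs))
    matches = [v for v, p in zip(remaining_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, pred) if p == 0]
    return matches, non_matches

# Toy example with clearly separated classes:
train = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.0, 0.1]]
labels = [1, 1, 0, 0]
m, n = svm_split(train, labels, [[0.95, 0.95], [0.05, 0.05]])
print(len(m), len(n))   # → 1 1
```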

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
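
The oracle above is simulated from the ground-truth match status: each queried label is reported correctly with probability equal to the oracle accuracy and flipped otherwise (with accuracy 1.0, as here, no labels are flipped). A minimal sketch with illustrative names:

```python
import random

def query_oracle(true_labels, acc=1.0, rng=None):
    """Simulate a human oracle: each true label is flipped with
    probability 1 - acc."""
    rng = rng or random.Random(42)        # fixed seed for reproducibility
    return [lab if rng.random() < acc else not lab for lab in true_labels]

labels = [True, False, True]
print(query_oracle(labels, acc=1.0))   # → [True, False, True]
```

With acc=1.0 the counts of false matches and false non-matches are always zero, exactly as reported in the log.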

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)782_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (10, 1 - acm diverg, 782), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)782_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 291
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 291 weight vectors
  Containing 199 true matches and 92 true non-matches
    (68.38% true matches)
  Identified 259 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   245  (94.59%)
          2 :    11  (4.25%)
          3 :     2  (0.77%)
         18 :     1  (0.39%)

Identified 1 non-pure unique weight vector (from 259 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 91

Removed 1 non-pure weight vector

Final number of weight vectors to use: 290
  Number of unique weight vectors: 259
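
The pureness step above groups identical weight vectors, computes each group's match fraction, and drops the minority-label copies of any non-pure group; here the one group with pureness 0.944 (plausibly 17 matches and 1 non-match among 18 identical copies) loses its single non-match copy, reducing 291 vectors to 290. A minimal sketch with collections.Counter (names are illustrative):

```python
from collections import Counter

def remove_minority_class(weight_vectors):
    """weight_vectors: list of (weights_tuple, is_match) pairs.
    For each distinct weight vector seen with both labels, drop the
    copies carrying the minority label (a tie keeps both labels)."""
    counts = Counter(weight_vectors)      # (vector, label) -> frequency
    kept = []
    for vec, label in weight_vectors:
        if counts[(vec, label)] >= counts[(vec, not label)]:
            kept.append((vec, label))
    return kept

# One vector occurring 17 times as a match and once as a non-match,
# plus a pure non-match vector occurring twice:
data = ([((0.9, 1.0), True)] * 17 + [((0.9, 1.0), False)]
        + [((0.1, 0.2), False)] * 2)
print(len(remove_minority_class(data)))   # → 19
```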

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (259, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 259 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 70

Perform initial selection using "far" method

Farthest first selection of 70 weight vectors from 259 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 32 matches and 38 non-matches
    Purity of oracle classification:  0.543
    Entropy of oracle classification: 0.995
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 189 weight vectors
  Based on 32 matches and 38 non-matches
  Classified 137 matches and 52 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 70
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (137, 0.5428571428571428, 0.9946937953613058, 0.45714285714285713)
    (52, 0.5428571428571428, 0.9946937953613058, 0.45714285714285713)

Current size of match and non-match training data sets: 32 / 38

Selected cluster with (queue ordering: random):
- Purity 0.54 and entropy 0.99
- Size 137 weight vectors
- Estimated match proportion 0.457

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 137 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 52 matches and 4 non-matches
    Purity of oracle classification:  0.929
    Entropy of oracle classification: 0.371
    Number of true matches:      52
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing the file: diverg(10)392_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976923
recall                 0.424749
f-measure              0.592075
da                          130
dm                            0
ndm                           0
tp                          127
fp                            3
tn                  4.76529e+07
fn                          172
Name: (10, 1 - acm diverg, 392), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)392_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 540
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 540 weight vectors
  Containing 130 true matches and 410 true non-matches
    (24.07% true matches)
  Identified 509 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   481  (94.50%)
          2 :    25  (4.91%)
          3 :     3  (0.59%)

Identified 0 non-pure unique weight vectors (from 509 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 119
     0.000 : 390

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 540
  Number of unique weight vectors: 509

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (509, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 509 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 509 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 26 matches and 55 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.905
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 428 weight vectors
  Based on 26 matches and 55 non-matches
  Classified 116 matches and 312 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (116, 0.6790123456790124, 0.9054522631867894, 0.32098765432098764)
    (312, 0.6790123456790124, 0.9054522631867894, 0.32098765432098764)

Current size of match and non-match training data sets: 26 / 55

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 312 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 66

Farthest first selection of 66 weight vectors from 312 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.767, 0.545, 0.818, 0.714, 0.773] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.423, 0.478, 0.357, 0.615, 0.727] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [0.800, 0.000, 0.625, 0.571, 0.467, 0.474, 0.667] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.778, 0.500, 0.789, 0.750, 0.385] (False)
    [1.000, 0.000, 0.333, 0.600, 0.800, 0.778, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.833, 0.833, 0.550, 0.500, 0.688] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.875, 0.467, 0.471, 0.833, 0.571] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.857, 0.000, 0.500, 0.389, 0.235, 0.045, 0.526] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.600, 0.857, 0.579, 0.286, 0.545] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.000, 0.700, 0.818, 0.444, 0.619] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [1.000, 0.000, 0.767, 0.667, 0.545, 0.786, 0.773] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 0 matches and 66 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0
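
The purity and entropy figures reported for each oracle classification follow the usual binary definitions: purity is the fraction of the majority class in the classified sample, and entropy is the base-2 Shannon entropy of the match/non-match split. A minimal sketch (function names are illustrative, not taken from the program):

```python
import math

def purity(num_matches, num_non_matches):
    # Fraction of the majority class in the classified sample.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Shannon entropy (base 2) of the match / non-match split:
    # 0.0 for a pure sample, 1.0 for a 50/50 split.
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        p = count / total
        if p > 0.0:
            h -= p * math.log2(p)
    return h
```

For the sample above (0 matches, 66 non-matches) this gives purity 1.000 and entropy 0.000, matching the log.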

*** Warning: Oracle returns an empty match dictionary ***
Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

130.0
Analysing file: diverg(10)97_NEW.csv
<class 'pandas.core.series.Series'>
Current row, right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 97), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)97_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 216 true matches and 737 true non-matches
    (22.67% true matches)
  Identified 898 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   862  (95.99%)
          2 :    33  (3.67%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 898 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 716

Removed 1 non-pure weight vector

Final number of weight vectors to use: 952
  Number of unique weight vectors: 898
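
The non-pure filtering above groups identical weight vectors and, where a unique vector carries both match and non-match labels, drops the minority-class copies. A dependency-free sketch of that step (the majority rule and its tie-break are assumptions, not taken from the program):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, is_match):
    # Group identical weight vectors; a unique vector is pure when all of
    # its occurrences share one true match status.  For non-pure vectors,
    # keep only the majority-class copies.
    groups = defaultdict(list)
    for vec, label in zip(weight_vectors, is_match):
        groups[tuple(vec)].append(label)
    kept = []
    for vec, labels in groups.items():
        majority = sum(labels) * 2 >= len(labels)   # assumed tie-break
        kept.extend((list(vec), lab) for lab in labels if lab == majority)
    return kept
```

In the run above, one unique vector occurs 19 times with pureness 0.947 (18 matches, 1 non-match), so the single minority copy is removed and 953 weight vectors become 952.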

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (898, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 898 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 898 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
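
The "far" (farthest-first) selection above can be sketched as a greedy traversal: start from one vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. A minimal sketch, assuming Euclidean distance and an arbitrary starting vector (the program may seed the traversal differently):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: repeatedly pick the vector with
    # the largest distance to its nearest already-selected vector.
    selected = [vectors[0]]          # arbitrary seed (assumption)
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each round costs one pass over the remaining vectors, so selecting k of n vectors is O(k·n) distance evaluations per already-selected point, which is cheap at the sample sizes shown here.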

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 812 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 112 matches and 700 non-matches
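
After the oracle labels a sample, the program trains an SVM on those labels and splits the remaining cluster by predicted class; both sub-clusters then re-enter the queue, as Loop 2's queue length of 2 shows. As a dependency-free stand-in for the SVM (a nearest-centroid rule, named plainly as a simplification), the splitting mechanics look like:

```python
import math

def centroid(vectors):
    # Component-wise mean of a list of equal-length vectors.
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def split_cluster(match_seeds, non_match_seeds, cluster):
    # Propagate the oracle's labels: each unlabelled vector joins the
    # sub-cluster whose labelled centroid is nearer.  (The actual program
    # trains an SVM here; nearest-centroid is a simplified stand-in.)
    cm = centroid(match_seeds)
    cn = centroid(non_match_seeds)
    matches, non_matches = [], []
    for v in cluster:
        if math.dist(v, cm) <= math.dist(v, cn):
            matches.append(v)
        else:
            non_matches.append(v)
    return matches, non_matches
```

In the log, the 24 oracle-labelled matches and 62 non-matches play the role of the seeds, and the remaining 812 vectors are split into 112 predicted matches and 700 predicted non-matches.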

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (700, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 112 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 46

Farthest first selection of 46 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 46 weight vectors
  The oracle will correctly classify 46 weight vectors and wrongly classify 0
  Classified 44 matches and 2 non-matches
    Purity of oracle classification:  0.957
    Entropy of oracle classification: 0.258
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 46 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)611_NEW.csv
<class 'pandas.core.series.Series'>
Current row, right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 611), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)611_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)68_NEW.csv
<class 'pandas.core.series.Series'>
Current row, right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 68), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)68_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
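
The "far" initial selection above is farthest-first (k-centre greedy) sampling: each new vector maximises its minimum distance to the vectors already chosen. A minimal sketch, assuming plain Python sequences and Euclidean distance (names are illustrative, not the program's API):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising its minimum
    Euclidean distance to the vectors selected so far."""
    selected = [vectors[0]]  # seed with an arbitrary vector
    # distance of every vector to its nearest selected centre
    dists = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: dists[j])
        selected.append(vectors[i])
        dists = [min(d, math.dist(v, vectors[i]))
                 for d, v in zip(dists, vectors)]
    return selected
```

This is why the listed sample mixes extreme matches and non-matches: the method deliberately spreads the sample across the weight-vector space.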

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
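
The purity and entropy figures above are the majority-class fraction and the binary Shannon entropy of the match proportion. A sketch of the computation (the function name is an assumption):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Return (purity, entropy) of a binary match/non-match split.
    Purity is the majority-class fraction; entropy is in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 24 matches and 63 non-matches above this gives purity 63/87 ≈ 0.724 and entropy ≈ 0.850, matching the log.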

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches
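
The split above trains an SVM on the oracle-labelled vectors and partitions the remaining cluster by its predictions. As a dependency-free stand-in for the SVM, this sketch splits on nearest class centroid instead; in the actual program a proper SVM (e.g. via scikit-learn) plays this role:

```python
def centroid_split(unlabelled, matches, non_matches):
    """Partition unlabelled weight vectors into (predicted_matches,
    predicted_non_matches) by nearest class centroid."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cm, cn = centroid(matches), centroid(non_matches)
    pred_m, pred_n = [], []
    for v in unlabelled:
        (pred_m if sqdist(v, cm) < sqdist(v, cn) else pred_n).append(v)
    return pred_m, pred_n
```

Both resulting sub-clusters are pushed back onto the queue, as the Loop 2 header below shows.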

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 123 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing the file: diverg(10)849_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 849), dtype: object
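
The precision, recall, and f-measure in this summary row follow directly from the tp/fp/fn counts. A quick check (the helper name is hypothetical):

```python
def precision_recall_f1(tp, fp, fn):
    """Standard binary-classification metrics from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

With tp=54, fp=1, fn=245 as above: precision 54/55 ≈ 0.981818, recall 54/299 ≈ 0.180602, f-measure ≈ 0.305085, matching the row.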

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)849_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 617
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 617 weight vectors
  Containing 200 true matches and 417 true non-matches
    (32.41% true matches)
  Identified 568 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   534  (94.01%)
          2 :    31  (5.46%)
          3 :     2  (0.35%)
         15 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 568 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 171
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 396

Removed 1 non-pure weight vector

Final number of weight vectors to use: 616
  Number of unique weight vectors: 568

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (568, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 568 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 568 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 26 matches and 56 non-matches
    Purity of oracle classification:  0.683
    Entropy of oracle classification: 0.901
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 486 weight vectors
  Based on 26 matches and 56 non-matches
  Classified 141 matches and 345 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6829268292682927, 0.9011701959974223, 0.3170731707317073)
    (345, 0.6829268292682927, 0.9011701959974223, 0.3170731707317073)

Current size of match and non-match training data sets: 26 / 56

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 141 weight vectors
- Estimated match proportion 0.317

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 50 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.235
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analyzing the file: diverg(15)215_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 215), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)215_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 804
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 804 weight vectors
  Containing 226 true matches and 578 true non-matches
    (28.11% true matches)
  Identified 765 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   746  (97.52%)
          2 :    16  (2.09%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 765 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 575

Removed 1 non-pure weight vector

Final number of weight vectors to use: 803
  Number of unique weight vectors: 765

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (765, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 765 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 765 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 680 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 153 matches and 527 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (527, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 527 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 527 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
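
The "farthest first" selections logged above repeatedly pick the weight vector most distant from those already chosen. A minimal sketch of that traversal, assuming a random starting vector and Euclidean distance (the actual script may use a different seed policy or metric):

```python
import random

def farthest_first_selection(vectors, k, seed=None):
    """Select k vectors by farthest-first traversal: start from a random
    vector, then repeatedly pick the vector whose minimum Euclidean
    distance to the already selected set is largest."""
    rng = random.Random(seed)
    remaining = list(vectors)
    selected = [remaining.pop(rng.randrange(len(remaining)))]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    while remaining and len(selected) < k:
        # A candidate's distance to the selected set is its minimum
        # distance to any already selected vector
        idx = max(range(len(remaining)),
                  key=lambda i: min(dist(remaining[i], s) for s in selected))
        selected.append(remaining.pop(idx))
    return selected
```

This greedy traversal spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches and clear non-matches.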

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 4 matches and 71 non-matches
    Purity of oracle classification:  0.947
    Entropy of oracle classification: 0.300
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0
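
The purity and entropy figures reported for each oracle classification are consistent with majority-class purity and the binary Shannon entropy of the match proportion: 4 matches and 71 non-matches give purity 71/75 ≈ 0.947 and entropy ≈ 0.300 bits. A sketch of both measures:

```python
import math

def cluster_purity(num_matches, num_non_matches):
    """Fraction of the majority class among the labelled vectors."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    """Binary (Shannon) entropy of the match proportion, in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0  # a pure cluster has zero entropy
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
```

A pure cluster scores purity 1.0 and entropy 0.0, which is the stopping condition the "Cluster not pure enough" messages test against.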

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)967_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976378
recall                 0.414716
f-measure               0.58216
da                          127
dm                            0
ndm                           0
tp                          124
fp                            3
tn                  4.76529e+07
fn                          175
Name: (10, 1 - acm diverg, 967), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)967_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 947
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 947 weight vectors
  Containing 141 true matches and 806 true non-matches
    (14.89% true matches)
  Identified 913 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   884  (96.82%)
          2 :    26  (2.85%)
          3 :     2  (0.22%)
          5 :     1  (0.11%)
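
The uniqueness analysis above (913 unique vectors out of 947, with 884 occurring once, 26 twice, 2 three times and 1 five times) amounts to a nested frequency count. A sketch, assuming the weight vectors are stored as hashable tuples of floats:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Return the number of unique weight vectors and a mapping
    occurrence-count -> number of unique vectors with that count."""
    vec_counts = Counter(weight_vectors)      # vector -> occurrences
    freq_dist = Counter(vec_counts.values())  # occurrences -> #vectors
    return len(vec_counts), dict(freq_dist)
```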

Identified 0 non-pure unique weight vectors (from 913 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 127
     0.000 : 786

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 947
  Number of unique weight vectors: 913

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (913, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 913 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 913 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 29 matches and 58 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 826 weight vectors
  Based on 29 matches and 58 non-matches
  Classified 246 matches and 580 non-matches
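
Each loop trains an SVM on the oracle-labelled sample and uses it to split the remaining weight vectors into a predicted-match and a predicted-non-match cluster. As a library-free stand-in for the SVM (the actual script presumably calls an SVM library), a simple perceptron-style linear separator illustrates the split step:

```python
def train_linear_classifier(matches, non_matches, epochs=100, lr=0.1):
    """Train a perceptron-style linear separator on the oracle-labelled
    weight vectors (a stand-in for the SVM used by the script)."""
    dim = len(matches[0])
    w, b = [0.0] * dim, 0.0
    data = [(v, 1) for v in matches] + [(v, -1) for v in non_matches]
    for _ in range(epochs):
        for x, y in data:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified: move the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def split_cluster(vectors, w, b):
    """Classify each unlabelled vector; return (matches, non_matches)."""
    pred_m, pred_n = [], []
    for x in vectors:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        (pred_m if score > 0 else pred_n).append(x)
    return pred_m, pred_n
```

The two predicted clusters are then pushed back onto the queue, which is why the queue length grows by one per loop.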

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (246, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (580, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 29 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 246 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 63

Farthest first selection of 63 weight vectors from 246 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 63 weight vectors
  The oracle will correctly classify 63 weight vectors and wrongly classify 0
  Classified 37 matches and 26 non-matches
    Purity of oracle classification:  0.587
    Entropy of oracle classification: 0.978
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0
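
The oracle lines above simulate a human reviewer with a configurable accuracy: each true match status is reported correctly with probability equal to the accuracy (here 100.00%, hence zero wrong classifications). A sketch of such a simulated oracle:

```python
import random

def oracle_classify(labelled_vectors, accuracy=1.0, seed=None):
    """Simulate a human oracle: report each vector's true label,
    flipped with probability (1 - accuracy). With accuracy=1.0 every
    label is returned correctly."""
    rng = random.Random(seed)
    results = []
    for vec, true_match in labelled_vectors:
        answer = true_match if rng.random() < accuracy else not true_match
        results.append((vec, answer))
    return results
```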

Deleted 63 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

127.0
Analysing file: diverg(20)294_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 294), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)294_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)215_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 215), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)215_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769
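The non-pure filtering step above (one vector value with pureness 0.950 losing its single minority-class copy) can be sketched as follows; `remove_non_pure` and its input layout are hypothetical reconstructions of what the log describes, not the program's own code:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """weight_vectors: list of (vector, is_true_match) pairs.
    Identical vectors are grouped; within a mixed (non-pure) group the
    minority-class copies are dropped, as described in the log."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, statuses in groups.items():
        pureness = sum(statuses) / len(statuses)  # fraction of matches
        majority_is_match = pureness >= 0.5
        for status in statuses:
            if pureness in (0.0, 1.0) or status == majority_is_match:
                kept.append((list(vec), status))
    return kept

# A group of 20 identical vectors with pureness 0.95 loses its single
# minority-class (non-match) copy, mirroring the "Removed 1" line above.
vectors = [([0.5, 0.5], True)] * 19 + [([0.5, 0.5], False)]
```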

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
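The farthest-first listings above come from a greedy k-center style traversal. A minimal sketch, assuming Euclidean distance between weight vectors and a deterministic start at the first vector (the real program may choose its starting point differently):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors: start with the first, then repeatedly add the
    vector whose distance to its nearest selected vector is largest
    (greedy k-center / farthest-first traversal)."""
    selected = [vectors[0]]
    # min_dist[i] = distance from vectors[i] to its closest selected vector
    min_dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected
```

Each round costs one pass over the remaining vectors, so selecting 85 of 769 vectors as above stays well within interactive time.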

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
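The `oracle_acc` parameter from the usage notes corresponds to an oracle that answers correctly with a given probability, which is why at 100.00 accuracy every classification above is correct. A hypothetical sketch (`noisy_oracle` is an illustrative name, not the program's own function):

```python
import random

def noisy_oracle(true_is_match, accuracy, rng):
    """Return the true match status with probability `accuracy`,
    otherwise flip it — simulating an imperfect human classifier."""
    return true_is_match if rng.random() < accuracy else not true_is_match

rng = random.Random(0)
labels = [noisy_oracle(True, 1.0, rng) for _ in range(85)]
# With accuracy 1.0 all 85 answers are correct, as in the run above.
```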

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 146 matches and 538 non-matches
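When the budget allows further splitting, the oracle-labelled sample serves as training data for an SVM that partitions the cluster's remaining vectors, as above (29 + 56 labelled, 684 remaining). A sketch using scikit-learn's `SVC` on synthetic stand-in data; the linear kernel and the random data here are assumptions for illustration, not taken from the program:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Stand-ins for the 29 oracle matches, 56 non-matches, 684 unlabelled
# 7-dimensional weight vectors from the run above.
train_x = np.vstack([rng.uniform(0.6, 1.0, (29, 7)),
                     rng.uniform(0.0, 0.5, (56, 7))])
train_y = np.array([1] * 29 + [0] * 56)
remaining = rng.uniform(0.0, 1.0, (684, 7))

clf = SVC(kernel="linear").fit(train_x, train_y)
pred = clf.predict(remaining)
match_cluster = remaining[pred == 1]      # becomes one new queue entry
non_match_cluster = remaining[pred == 0]  # becomes the other queue entry
```

The two predicted groups are then pushed onto the cluster queue, which is why the next loop reports a queue length of 2.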

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (538, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 146 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 50 matches and 4 non-matches
    Purity of oracle classification:  0.926
    Entropy of oracle classification: 0.381
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)449_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 449), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)449_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1038
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1038 weight vectors
  Containing 207 true matches and 831 true non-matches
    (19.94% true matches)
  Identified 991 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   956  (96.47%)
          2 :    32  (3.23%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 991 unique weight vectors)
Pureness (fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 810

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1037
  Number of unique weight vectors: 991

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (991, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 991 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 991 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 904 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 307 matches and 597 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (307, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (597, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 597 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 597 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(10)758_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 758), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)758_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 812
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 812 weight vectors
  Containing 209 true matches and 603 true non-matches
    (25.74% true matches)
  Identified 765 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   730  (95.42%)
          2 :    32  (4.18%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 765 unique weight vectors)
Pureness (fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 582

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 811
  Number of unique weight vectors: 765

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (765, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 765 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 765 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 26 matches and 59 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.888
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
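The purity and entropy figures reported for each oracle-classified sample follow the standard binary-cluster definitions: purity is the proportion of the majority class, entropy the base-2 Shannon entropy of the match/non-match split. A minimal sketch (the function name `cluster_stats` is illustrative, not from the program):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity and entropy of a binary (match / non-match) cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total            # match proportion
    purity = max(p, 1.0 - p)           # proportion of the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):             # Shannon entropy, base 2
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# The 26 matches / 59 non-matches classified above:
purity, entropy = cluster_stats(26, 59)
print(round(purity, 3), round(entropy, 3))  # 0.694 0.888
```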

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 680 weight vectors
  Based on 26 matches and 59 non-matches
  Classified 126 matches and 554 non-matches
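At this step the remaining unlabelled weight vectors of the cluster are split into two child clusters by a classifier trained on the oracle-labelled sample. A hedged sketch using scikit-learn's `SVC` (the program may use a different SVM implementation; the 2-D toy vectors below are illustrative, not taken from the log):

```python
import numpy as np
from sklearn.svm import SVC

# Oracle-labelled sample (hypothetical 2-D toy vectors; the log above
# actually uses 7-dimensional similarity weight vectors)
X_train = np.array([[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]])
y_train = np.array([1, 1, 0, 0])            # 1 = match, 0 = non-match

# Remaining unlabelled weight vectors in the selected cluster
X_rest = np.array([[0.85, 0.95], [0.15, 0.05], [0.9, 0.8]])

clf = SVC(kernel="linear").fit(X_train, y_train)
pred = clf.predict(X_rest)

# Split into a predicted-match and a predicted-non-match child cluster,
# which are then pushed back onto the queue
match_child = X_rest[pred == 1]
non_match_child = X_rest[pred == 0]
print(len(match_child), len(non_match_child))  # 2 1
```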

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (126, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)
    (554, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)

Current size of match and non-match training data sets: 26 / 59

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 554 weight vectors
- Estimated match proportion 0.306

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 554 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
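The farthest-first selection used above to draw a diverse sample from a cluster can be sketched as the usual greedy traversal: after seeding with one vector, each step adds the vector whose distance to its nearest already-selected vector is largest. A sketch assuming Euclidean distance and an arbitrary seed (the actual program's seeding rule may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors."""
    selected = [vectors[0]]                       # arbitrary seed
    # distance from every vector to its nearest selected vector
    dists = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=dists.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):           # refresh nearest distances
            dists[j] = min(dists[j], math.dist(v, vectors[i]))
    return selected

# Toy 2-D example: the isolated point (10, 0) is picked second
print(farthest_first([(0.0, 0.0), (1.0, 0.0), (0.5, 0.0), (10.0, 0.0)], 2))
# [(0.0, 0.0), (10.0, 0.0)]
```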

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 16 matches and 55 non-matches
    Purity of oracle classification:  0.775
    Entropy of oracle classification: 0.770
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(10)693_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (10, 1 - acm diverg, 693), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)693_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 603
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 603 weight vectors
  Containing 156 true matches and 447 true non-matches
    (25.87% true matches)
  Identified 569 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   539  (94.73%)
          2 :    27  (4.75%)
          3 :     2  (0.35%)
          4 :     1  (0.18%)
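The unique-vector count and the occurrence histogram above can be reproduced with a `collections.Counter` over hashable vectors; a minimal sketch (the toy vectors below are illustrative, not from the file):

```python
from collections import Counter

# Hypothetical toy set of weight vectors (tuples, so they are hashable)
vectors = [(1.0, 0.0), (1.0, 0.0), (0.5, 0.5), (0.2, 0.8), (1.0, 0.0)]

counts = Counter(vectors)          # vector -> number of occurrences
print(len(counts))                 # number of unique weight vectors: 3

# Frequency distribution: occurrence -> unique vectors occurring that often
freq = Counter(counts.values())
for occ in sorted(freq):
    print(occ, ":", freq[occ])     # 1 : 2  then  3 : 1
```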

Identified 0 non-pure unique weight vectors (from 569 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 142
     0.000 : 427

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 603
  Number of unique weight vectors: 569

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (569, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 569 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 569 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 487 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 144 matches and 343 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (343, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 144 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 144 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 39 matches and 15 non-matches
    Purity of oracle classification:  0.722
    Entropy of oracle classification: 0.852
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  15
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analyzing file: diverg(15)428_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 428), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)428_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 695
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 695 weight vectors
  Containing 194 true matches and 501 true non-matches
    (27.91% true matches)
  Identified 671 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   654  (97.47%)
          2 :    14  (2.09%)
          3 :     2  (0.30%)
          7 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 671 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.000 : 499

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 695
  Number of unique weight vectors: 671

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (671, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 671 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 671 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 34 matches and 50 non-matches
    Purity of oracle classification:  0.595
    Entropy of oracle classification: 0.974
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 587 weight vectors
  Based on 34 matches and 50 non-matches
  Classified 276 matches and 311 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (276, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)
    (311, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)

Current size of match and non-match training data sets: 34 / 50

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 276 weight vectors
- Estimated match proportion 0.405

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 276 vectors
  The selected farthest weight vectors are:
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 44 matches and 25 non-matches
    Purity of oracle classification:  0.638
    Entropy of oracle classification: 0.945
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  25
    Number of false non-matches: 0
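The purity and entropy figures reported for each oracle-classified sample can be computed as below. This is a minimal sketch assuming binary match/non-match labels: purity is the majority-class proportion and entropy is the binary Shannon entropy, which reproduce the 0.638 / 0.945 values above for 44 matches and 25 non-matches.

```python
import math

def cluster_purity_entropy(labels):
    """Purity and binary entropy of a set of match/non-match labels.

    'labels' is assumed to be a sequence of booleans (True = match).
    """
    n = len(labels)
    num_match = sum(labels)
    p = num_match / n  # proportion of matches
    purity = max(p, 1.0 - p)  # majority-class proportion
    if p in (0.0, 1.0):
        entropy = 0.0  # a pure cluster has zero entropy
    else:
        entropy = -(p * math.log(p, 2) + (1 - p) * math.log(1 - p, 2))
    return purity, entropy
```

For the sample above, `cluster_purity_entropy([True] * 44 + [False] * 25)` gives (0.638, 0.945) after rounding to three decimals.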

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(15)172_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 172), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)172_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 765
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 765 weight vectors
  Containing 198 true matches and 567 true non-matches
    (25.88% true matches)
  Identified 723 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   688  (95.16%)
          2 :    32  (4.43%)
          3 :     2  (0.28%)
          7 :     1  (0.14%)
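The occurrence table above (how many unique weight vectors appear once, twice, and so on) can be reproduced with two `Counter` passes; converting each vector to a tuple so it is hashable is an assumption about the original data structures.

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of unique vectors with that count.

    Reproduces the 'Occurrence : Number of weight vectors' table in the
    log; weight_vectors is assumed to be a list of sequences of floats.
    """
    # Count how often each distinct weight vector occurs
    vec_counts = Counter(map(tuple, weight_vectors))
    # Then count how many unique vectors share each occurrence count
    return Counter(vec_counts.values())
```

For example, `occurrence_distribution([(1.0, 0.5), (1.0, 0.5), (0.2, 0.3)])` yields `{2: 1, 1: 1}`: one vector occurring twice and one occurring once.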

Identified 0 non-pure unique weight vectors (from 723 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 547

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 765
  Number of unique weight vectors: 723

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (723, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 723 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 723 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
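The farthest-first ("far") selection used above can be sketched as a greedy k-centre pass: start from one vector, then repeatedly pick the vector whose minimum Euclidean distance to the already-selected set is largest. The random start vector, the Euclidean metric, and the tie-breaking are assumptions here, since the log does not show them.

```python
import math
import random

def farthest_first(vectors, k, seed=None):
    """Greedy farthest-first selection of k vectors (a sketch)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    rng = random.Random(seed)
    remaining = [tuple(v) for v in vectors]
    # Start from a randomly chosen vector (an assumption)
    selected = [remaining.pop(rng.randrange(len(remaining)))]
    # min_dist[i]: distance from remaining[i] to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in remaining]
    while len(selected) < k and remaining:
        # Pick the vector farthest from everything selected so far
        i = max(range(len(remaining)), key=min_dist.__getitem__)
        chosen = remaining.pop(i)
        min_dist.pop(i)
        selected.append(chosen)
        # Update nearest-selected distances for the rest
        for j, v in enumerate(remaining):
            min_dist[j] = min(min_dist[j], dist(v, chosen))
    return selected
```

This greedy heuristic tends to cover the extremes of the weight-vector space, which is why the selected samples above mix clearly matching and clearly non-matching vectors.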

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 638 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 144 matches and 494 non-matches
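The SVM split step above (train on the oracle-labelled sample, then partition the remaining weight vectors into a predicted-match and a predicted-non-match cluster) can be sketched with scikit-learn. `SVC` with a linear kernel is an assumption; the log does not name the SVM implementation or its parameters.

```python
# Sketch of the cluster-splitting step, assuming scikit-learn's SVC.
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    """Split rest_vecs into (predicted matches, predicted non-matches).

    train_labels: 1 = match, 0 = non-match, from the oracle sample.
    """
    clf = SVC(kernel="linear")  # linear kernel is an assumption
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(rest_vecs)
    match_cluster = [v for v, p in zip(rest_vecs, pred) if p == 1]
    nonmatch_cluster = [v for v, p in zip(rest_vecs, pred) if p == 0]
    return match_cluster, nonmatch_cluster
```

Both resulting clusters are then pushed back onto the queue with the parent cluster's purity, entropy, and estimated match proportion, as seen in the next loop.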

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (494, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 144 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 144 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(15)42_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (15, 1 - acm diverg, 42), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)42_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 997
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 997 weight vectors
  Containing 170 true matches and 827 true non-matches
    (17.05% true matches)
  Identified 960 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   929  (96.77%)
          2 :    28  (2.92%)
          3 :     2  (0.21%)
          6 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 960 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 153
     0.000 : 807

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 997
  Number of unique weight vectors: 960

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (960, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 960 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 960 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 25 matches and 62 non-matches
    Purity of oracle classification:  0.713
    Entropy of oracle classification: 0.865
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 873 weight vectors
  Based on 25 matches and 62 non-matches
  Classified 42 matches and 831 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (42, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)
    (831, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 25 / 62

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 42 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 28

Farthest first selection of 28 weight vectors from 42 vectors
  The selected farthest weight vectors are:
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [0.971, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.833, 1.000, 1.000, 0.935] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)

Perform oracle with 100.00% accuracy on 28 weight vectors
  The oracle will correctly classify 28 weight vectors and wrongly classify 0
  Classified 28 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 28 weight vectors (classified by oracle) from cluster

Cluster is pure enough and not too large, add its 42 weight vectors to:
  Match training set

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 3: Queue length: 1
  Number of manual oracle classifications performed: 115
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (831, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 67 / 62

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 831 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 831 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 16 matches and 56 non-matches
    Purity of oracle classification:  0.778
    Entropy of oracle classification: 0.764
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing the file: diverg(10)756_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (10, 1 - acm diverg, 756), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)756_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 814
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 814 weight vectors
  Containing 220 true matches and 594 true non-matches
    (27.03% true matches)
  Identified 758 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   722  (95.25%)
          2 :    33  (4.35%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
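A distribution like the one above can be derived by first counting how often each distinct weight vector occurs and then counting the counts. A sketch using `collections.Counter` (the helper name `occurrence_distribution` is ours):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that occur exactly that often."""
    counts = Counter(tuple(v) for v in weight_vectors)  # per-vector counts
    return Counter(counts.values())                     # count the counts
```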

Identified 1 non-pure unique weight vector (from 758 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 573

Removed 1 non-pure weight vector

Final number of weight vectors to use: 813
  Number of unique weight vectors: 758

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (758, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 758 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 758 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
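The "far" initial selection above is a greedy farthest-first traversal: seed with one vector, then repeatedly add the vector whose minimum distance to the already selected set is largest. A sketch under the assumption of Euclidean distance and a random seed vector (the program's actual metric and seeding are not shown in this output):

```python
import numpy as np

def farthest_first(vectors, k, rng=None):
    """Greedy farthest-first traversal: return indices of k vectors
    that are maximally spread out (assumes k <= len(vectors))."""
    rng = np.random.default_rng(rng)
    vectors = np.asarray(vectors, dtype=float)
    selected = [int(rng.integers(len(vectors)))]
    # Distance of every vector to its nearest selected vector so far
    dists = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(dists))  # farthest from the selected set
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```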

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 673 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 146 matches and 527 non-matches
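The split step trains a classifier on the oracle-labelled sample and uses it to divide the remaining weight vectors into predicted matches and non-matches, which become the two new clusters in the queue. A sketch using scikit-learn's `SVC`; the kernel and parameters of the original program are assumptions:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-labelled weight vectors (label 1 = match,
    0 = non-match) and split the remaining cluster by its predictions."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches
```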

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (527, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 146 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)370_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 370), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)370_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 559
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 559 weight vectors
  Containing 187 true matches and 372 true non-matches
    (33.45% true matches)
  Identified 535 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   522  (97.57%)
          2 :    10  (1.87%)
          3 :     2  (0.37%)
         11 :     1  (0.19%)

Identified 1 non-pure unique weight vector (from 535 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 163
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 371

Removed 1 non-pure weight vector

Final number of weight vectors to use: 558
  Number of unique weight vectors: 535

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (535, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 535 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 535 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 28 matches and 53 non-matches
    Purity of oracle classification:  0.654
    Entropy of oracle classification: 0.930
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 454 weight vectors
  Based on 28 matches and 53 non-matches
  Classified 138 matches and 316 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.654320987654321, 0.9301497323974337, 0.345679012345679)
    (316, 0.654320987654321, 0.9301497323974337, 0.345679012345679)

Current size of match and non-match training data sets: 28 / 53

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 316 weight vectors
- Estimated match proportion 0.346

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 316 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.500, 0.826, 0.429, 0.538, 0.636] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.800, 0.684, 0.667, 0.529, 0.609] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.846, 0.542, 0.588, 0.579, 0.423] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 3 matches and 65 non-matches
    Purity of oracle classification:  0.956
    Entropy of oracle classification: 0.261
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(15)23_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 23), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)23_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 956
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 956 weight vectors
  Containing 205 true matches and 751 true non-matches
    (21.44% true matches)
  Identified 905 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   871  (96.24%)
          2 :    31  (3.43%)
          3 :     2  (0.22%)
         17 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 905 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 730

Removed 1 non-pure weight vector

Final number of weight vectors to use: 955
  Number of unique weight vectors: 905

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (905, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 905 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 905 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

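The purity, entropy, and estimated match proportion reported for each oracle sample follow directly from the match / non-match counts. A minimal sketch in plain Python (the function name is illustrative, not from the script):

```python
import math

def cluster_purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = Shannon
    entropy (base 2) of the match / non-match proportions."""
    total = num_matches + num_non_matches
    p = num_matches / total          # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# The 24 matches / 63 non-matches classified by the oracle above:
purity, entropy = cluster_purity_entropy(24, 63)
# purity ≈ 0.724, entropy ≈ 0.850, match proportion 24/87 ≈ 0.276
```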
Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 818 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 112 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

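The farthest-first selections shown in this log can be reproduced with a standard farthest-first traversal: seed with one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A sketch assuming Euclidean distance and an arbitrary first seed (the script's actual metric and seeding may differ):

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal."""
    selected = [vectors[0]]                          # arbitrary seed
    min_dist = [euclidean(vectors[0], v) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):              # distances to the grown set
            d = euclidean(selected[-1], v)
            if d < min_dist[i]:
                min_dist[i] = d
    return selected
```

Farthest-first sampling favours spread-out, boundary examples, which is why the selected vectors above mix clear matches and clear non-matches.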
Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

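The runs above all follow the same recursive loop: pop a cluster from the queue, sample it, label the sample via the oracle, and either accept the cluster as pure or split it and requeue the parts, stopping once the manual-classification budget is exhausted. A heavily simplified sketch (here splitting just halves the unlabelled remainder, where the real script trains an SVM on the labelled sample):

```python
import random

def recursive_select(clusters, oracle, budget, sample_size, min_purity=0.95):
    """Toy version of the recursive selection loop (hypothetical
    simplification of the script's behaviour)."""
    queue = [list(c) for c in clusters]
    training, used = [], 0
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random queue ordering
        sample, rest = cluster[:sample_size], cluster[sample_size:]
        labels = [oracle(v) for v in sample]               # manual classifications
        used += len(sample)
        training.extend(zip(sample, labels))
        p = sum(labels) / len(labels)                      # sample match proportion
        if max(p, 1.0 - p) < min_purity and len(rest) > 1:
            mid = len(rest) // 2                           # split impure cluster
            queue.extend([rest[:mid], rest[mid:]])
    return training, used
```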
55.0
Analysing the file: diverg(20)652_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 652), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)652_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

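The uniqueness analysis above (1,038 unique vectors out of 1,094, plus the occurrence distribution) is a duplicate count over hashable vectors; `collections.Counter` gives it directly. A small sketch with made-up vectors:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count unique weight vectors and how often each multiplicity occurs."""
    per_vector = Counter(tuple(v) for v in weight_vectors)  # tuples are hashable
    distribution = Counter(per_vector.values())             # occurrence -> count
    return len(per_vector), dict(distribution)

# Made-up example: one duplicated vector plus two singletons
vecs = [[0.5, 1.0], [0.5, 1.0], [0.1, 0.2], [1.0, 1.0]]
num_unique, dist = occurrence_distribution(vecs)
# num_unique == 3, dist == {2: 1, 1: 2}
```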
Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

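A non-pure unique weight vector was generated by both true matching and true non-matching record pairs; the minority-class copies are dropped so every unique vector keeps a single true label. A sketch of that filter (names illustrative, not from the script):

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """Drop minority-class copies of each unique weight vector.

    labelled_vectors: list of (weight_vector_tuple, is_match) pairs.
    Returns only the copies carrying the majority label per vector.
    """
    by_vec = defaultdict(list)
    for vec, is_match in labelled_vectors:
        by_vec[vec].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        majority = sum(labels) * 2 >= len(labels)  # True when matches are >= half
        kept.extend((vec, lab) for lab in labels if lab == majority)
    return kept
```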
Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 950 non-matches

46.0
Analysing the file: diverg(10)748_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979592
recall                  0.32107
f-measure              0.483627
da                           98
dm                            0
ndm                           0
tp                           96
fp                            2
tn                  4.76529e+07
fn                          203
Name: (10, 1 - acm diverg, 748), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)748_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 415
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 415 weight vectors
  Containing 167 true matches and 248 true non-matches
    (40.24% true matches)
  Identified 396 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   383  (96.72%)
          2 :    10  (2.53%)
          3 :     2  (0.51%)
          6 :     1  (0.25%)

Identified 0 non-pure unique weight vectors (from 396 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 150
     0.000 : 246

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 415
  Number of unique weight vectors: 396

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (396, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 396 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 396 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 36 matches and 41 non-matches
    Purity of oracle classification:  0.532
    Entropy of oracle classification: 0.997
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  41
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 319 weight vectors
  Based on 36 matches and 41 non-matches
  Classified 254 matches and 65 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (254, 0.5324675324675324, 0.9969562518473083, 0.4675324675324675)
    (65, 0.5324675324675324, 0.9969562518473083, 0.4675324675324675)

Current size of match and non-match training data sets: 36 / 41

Selected cluster (queue ordering: random) with:
- Purity 0.53 and entropy 1.00
- Size 65 weight vectors
- Estimated match proportion 0.468

Sample size for this cluster: 39

Farthest first selection of 39 weight vectors from 65 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)

Perform oracle with 100.00% accuracy on 39 weight vectors
  The oracle will correctly classify 39 weight vectors and wrongly classify 0
  Classified 0 matches and 39 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  39
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 39 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

98.0
Analysing the file: diverg(10)177_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (10, 1 - acm diverg, 177), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)177_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 506
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 506 weight vectors
  Containing 187 true matches and 319 true non-matches
    (36.96% true matches)
  Identified 482 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   469  (97.30%)
          2 :    10  (2.07%)
          3 :     2  (0.41%)
         11 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 482 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 165
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 316

Removed 1 non-pure weight vector

Final number of weight vectors to use: 505
  Number of unique weight vectors: 482

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (482, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 482 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 482 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

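The "farthest first" step above greedily picks, one vector at a time, the weight vector whose distance to its nearest already-selected vector is largest. A minimal sketch, assuming Euclidean distance over the similarity weights; the function and variable names are illustrative, not taken from the original program:

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising its distance
    to the closest vector selected so far."""
    selected = [vectors[0]]              # seed with an arbitrary vector
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(euclidean(v, s) for s in selected),
        )
        selected.append(best)
    return selected

# Toy usage: pick 2 of 4 one-dimensional points
print(farthest_first([(0.0,), (0.1,), (0.9,), (1.0,)], 2))
# -> [(0.0,), (1.0,)]
```

The choice of seed vector affects which vectors are returned; the sketch simply takes the first one, whereas a real implementation might seed randomly or with the pair of most distant vectors.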
Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 34 matches and 46 non-matches
    Purity of oracle classification:  0.575
    Entropy of oracle classification: 0.984
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  46
    Number of false non-matches: 0

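The purity and entropy figures printed for the oracle's 34 matches and 46 non-matches follow directly from the binary class proportions: purity is the majority-class fraction, entropy the binary Shannon entropy in bits. A small sketch (function names are illustrative):

```python
import math

def purity(n_match, n_nonmatch):
    """Fraction of the sample belonging to the majority class."""
    total = n_match + n_nonmatch
    return max(n_match, n_nonmatch) / total

def entropy(n_match, n_nonmatch):
    """Binary Shannon entropy of the match proportion, in bits."""
    total = n_match + n_nonmatch
    h = 0.0
    for n in (n_match, n_nonmatch):
        if n:
            p = n / total
            h -= p * math.log2(p)
    return h

print(round(purity(34, 46), 3))   # 0.575, as in the log above
print(round(entropy(34, 46), 3))  # 0.984
```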
Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 402 weight vectors
  Based on 34 matches and 46 non-matches
  Classified 126 matches and 276 non-matches

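The SVM step trains on the 80 oracle-labelled vectors and splits the remaining 402 into predicted matches and non-matches, which become the two new clusters in the queue. As a dependency-free illustration of such a split, here is a nearest-centroid stand-in for the SVM (explicitly not the classifier the program uses; all names are illustrative):

```python
import math

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def split_cluster(train_match, train_nonmatch, unlabeled):
    """Assign each unlabeled vector to the side whose training
    centroid is closer, mimicking the match / non-match split."""
    cm, cn = centroid(train_match), centroid(train_nonmatch)
    matches, nonmatches = [], []
    for v in unlabeled:
        if math.dist(v, cm) <= math.dist(v, cn):
            matches.append(v)
        else:
            nonmatches.append(v)
    return matches, nonmatches

m, n = split_cluster([[1.0, 1.0]], [[0.0, 0.0]],
                     [[0.9, 0.8], [0.1, 0.2]])
print(len(m), len(n))  # 1 1
```

A real SVM (e.g. scikit-learn's `SVC`) draws a maximum-margin boundary instead of comparing centroid distances, but the downstream use is the same: the predicted labels partition the cluster in two.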
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (126, 0.575, 0.9837082626231857, 0.425)
    (276, 0.575, 0.9837082626231857, 0.425)

Current size of match and non-match training data sets: 34 / 46

Selected cluster (queue ordering: random) with:
- Purity 0.57 and entropy 0.98
- Size 126 weight vectors
- Estimated match proportion 0.425

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 126 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

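Across the loops, the control flow visible in the log is: pop a cluster from the queue, sample it, have the oracle label the sample, and split clusters that are still impure back into the queue, stopping once the manual-classification budget is spent. A runnable miniature with a perfect oracle and a trivial halving split (all names, the sample size, and the split rule are illustrative; items are assumed unique):

```python
import random

def run_selection(cluster, budget, sample_size=4, min_purity=0.95, seed=1):
    """cluster: list of unique (vector, true_label) pairs.
    Returns (oracle labellings used, training examples collected)."""
    rng = random.Random(seed)
    queue, used, train = [cluster], 0, []
    while queue and used + sample_size <= budget:
        cur = queue.pop(0)
        sample = rng.sample(cur, min(sample_size, len(cur)))
        used += len(sample)
        train += sample                      # oracle is 100% accurate here
        rest = [x for x in cur if x not in sample]
        matches = sum(lbl for _, lbl in sample)
        purity = max(matches, len(sample) - matches) / len(sample)
        if rest and purity < min_purity:     # impure: split and re-queue
            mid = len(rest) // 2
            queue += [rest[:mid], rest[mid:]]
    return used, len(train)
```

The real program additionally orders the queue, checks cluster-size bounds, and splits with a trained classifier rather than a blind halving, but the budget-bounded loop structure is the same.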
79.0
Analysing file: diverg(10)388_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 388), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)388_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 584
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 584 weight vectors
  Containing 168 true matches and 416 true non-matches
    (28.77% true matches)
  Identified 564 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   553  (98.05%)
          2 :     8  (1.42%)
          3 :     2  (0.35%)
          9 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 564 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 150
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 413

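The pureness table above comes from grouping duplicate weight vectors and measuring, for each unique vector, the fraction of its occurrences that are true matches; vectors occurring with mixed labels are non-pure and are removed before training selection. A sketch with illustrative names:

```python
from collections import defaultdict

def pureness(weight_vectors):
    """weight_vectors: list of (vector_tuple, is_match) pairs.
    Returns {vector: fraction of its occurrences that are matches}."""
    counts = defaultdict(lambda: [0, 0])       # vector -> [matches, total]
    for vec, is_match in weight_vectors:
        counts[vec][0] += int(is_match)
        counts[vec][1] += 1
    return {vec: m / t for vec, (m, t) in counts.items()}

# Toy data: one vector occurring 9 times with 8 matches (non-pure),
# one pure non-match
data = [((0.9, 0.8), True)] * 8 + [((0.9, 0.8), False)] + [((0.1,), False)]
p = pureness(data)
print(round(p[(0.9, 0.8)], 3))  # 0.889, like the non-pure vector above
print(p[(0.1,)])                # 0.0
```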
Removed 9 non-pure weight vectors

Final number of weight vectors to use: 575
  Number of unique weight vectors: 563

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (563, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 563 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 563 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.538, 0.789, 0.353, 0.545, 0.550] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 31 matches and 51 non-matches
    Purity of oracle classification:  0.622
    Entropy of oracle classification: 0.957
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 481 weight vectors
  Based on 31 matches and 51 non-matches
  Classified 125 matches and 356 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (125, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)
    (356, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)

Current size of match and non-match training data sets: 31 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 125 weight vectors
- Estimated match proportion 0.378

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 125 vectors
  The selected farthest weight vectors are:
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 46 matches and 7 non-matches
    Purity of oracle classification:  0.868
    Entropy of oracle classification: 0.563
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing file: diverg(20)996_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 996), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)996_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1027
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1027 weight vectors
  Containing 223 true matches and 804 true non-matches
    (21.71% true matches)
  Identified 973 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   936  (96.20%)
          2 :    34  (3.49%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 973 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 783

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 973

Time to load and analyse the weight vector file: 0.05 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (973, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 973 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 973 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 886 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 131 matches and 755 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (755, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 755 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 755 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)

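The "farthest first" selection shown above can be sketched as a greedy farthest-first traversal over the weight vectors. This is a minimal illustration: the function name is made up here, Euclidean distance is assumed, and the actual metric and tie-breaking used by the script are not shown in this log.

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Greedily pick k vectors: start from a random one, then repeatedly
    add the candidate whose nearest selected vector is farthest away."""
    rng = random.Random(seed)
    remaining = [list(v) for v in vectors]
    selected = [remaining.pop(rng.randrange(len(remaining)))]

    def dist(a, b):
        # Euclidean distance between two weight vectors (an assumption)
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    while len(selected) < k and remaining:
        # For each candidate, its distance to the closest selected vector;
        # take the candidate maximising that distance
        idx = max(range(len(remaining)),
                  key=lambda i: min(dist(remaining[i], s) for s in selected))
        selected.append(remaining.pop(idx))
    return selected

sample = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5], [0.0, 1.0]], 3)
print(len(sample))  # 3
```

Farthest-first traversal tends to pick vectors spread across the cluster, which is why the selected lists above mix clear matches and clear non-matches.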
Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 11 matches and 62 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

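The purity and entropy figures reported for each oracle classification can be reproduced from the match/non-match counts: purity is the majority-class proportion and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch (the function name is illustrative, not from the original script):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity = proportion of the majority class; entropy = binary
    Shannon entropy of the match proportion, in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# Counts from the oracle step above: 11 matches, 62 non-matches
purity, entropy = purity_and_entropy(11, 62)
print(round(purity, 3), round(entropy, 3))  # 0.849 0.612
```

A perfectly balanced cluster gives purity 0.5 and entropy 1.0, which is exactly the state every cluster starts in before any oracle labels are available.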
Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)429_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 429), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)429_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 444
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 444 weight vectors
  Containing 209 true matches and 235 true non-matches
    (47.07% true matches)
  Identified 410 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   393  (95.85%)
          2 :    14  (3.41%)
          3 :     2  (0.49%)
         17 :     1  (0.24%)

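The occurrence distribution above (how many distinct weight vectors appear once, twice, and so on) can be computed with two nested counts; a small sketch with an illustrative function name:

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Count how often each distinct weight vector occurs, then count
    how many distinct vectors share each occurrence frequency."""
    per_vector = Counter(tuple(v) for v in vectors)
    return Counter(per_vector.values())

# Toy data: one vector occurring once, one twice, one three times
vecs = [[0.1], [0.1], [0.2], [0.3], [0.3], [0.3]]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```

Duplicate weight vectors matter here because the later clustering steps operate on the unique vectors only (410 of the 444 loaded ones in this run).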
Identified 1 non-pure unique weight vector (from 410 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 177
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 232

Removed 1 non-pure weight vector

Final number of weight vectors to use: 443
  Number of unique weight vectors: 410

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (410, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 410 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 410 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.833, 0.550, 0.500, 0.688] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 39 matches and 39 non-matches
    Purity of oracle classification:  0.500
    Entropy of oracle classification: 1.000
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  39
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 332 weight vectors
  Based on 39 matches and 39 non-matches
  Classified 272 matches and 60 non-matches

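The "SVM classification" step above trains on the oracle-labelled match/non-match training sets and splits the remaining cluster by predicted class. A sketch of that step using scikit-learn follows; the kernel and all parameter choices are assumptions, since the log does not show them:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-labelled weight vectors (1 = match,
    0 = non-match), then split the remaining cluster by prediction."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    preds = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches

# Toy example: low similarity weights ~ non-match, high ~ match
train = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
labels = [0, 0, 1, 1]
m, n = svm_split(train, labels, [[0.85, 0.9], [0.15, 0.1]])
print(len(m), len(n))  # 1 1
```

The two predicted sub-clusters (272 matches and 60 non-matches in this run) are then pushed back onto the queue for further refinement.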
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (272, 0.5, 1.0, 0.5)
    (60, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 39 / 39

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 272 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 272 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.913, 1.000, 0.184, 0.175, 0.087, 0.233, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 43 matches and 28 non-matches
    Purity of oracle classification:  0.606
    Entropy of oracle classification: 0.968
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)211_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 211), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)211_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 765
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 765 weight vectors
  Containing 198 true matches and 567 true non-matches
    (25.88% true matches)
  Identified 723 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   688  (95.16%)
          2 :    32  (4.43%)
          3 :     2  (0.28%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 723 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 547

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 765
  Number of unique weight vectors: 723

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (723, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 723 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 723 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 638 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 144 matches and 494 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (494, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 494 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 494 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.767, 0.667, 0.545, 0.786, 0.773] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 3 matches and 68 non-matches
    Purity of oracle classification:  0.958
    Entropy of oracle classification: 0.253
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(20)728_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 728), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)728_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1075
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1075 weight vectors
  Containing 208 true matches and 867 true non-matches
    (19.35% true matches)
  Identified 1028 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   993  (96.60%)
          2 :    32  (3.11%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)
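An occurrence distribution like the one above can be built with two passes of `collections.Counter`: first count how often each distinct weight vector occurs, then count how many vectors share each occurrence count. A sketch on toy data (the helper name is mine):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # Count occurrences of each distinct vector (tuples are hashable).
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    # Map: occurrence count -> number of unique vectors with that count.
    return Counter(vec_counts.values())

# Toy data, not the real weight vectors: one vector appears twice.
dist = occurrence_distribution(
    [[1.0, 0.5], [1.0, 0.5], [0.2, 0.9], [0.3, 0.3]])
print(sorted(dist.items()))  # [(1, 2), (2, 1)]
```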

Identified 1 non-pure unique weight vector (from 1028 unique weight vectors)
Pureness (as a percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector
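A non-pure unique weight vector is one whose underlying record pairs carry mixed true-match labels; the minority-class copies are removed so every unique vector ends up with a single label. The 0.917 pureness above is consistent with the 12-occurrence vector holding 11 matches and 1 non-match. A sketch of that filtering step under this reading (the function is mine, not the program's):

```python
def remove_minority_class(labels):
    # labels: True/False match status of every copy of one unique vector.
    matches = sum(labels)
    majority = matches >= len(labels) - matches
    kept = [lab for lab in labels if lab == majority]
    pureness = matches / len(labels)  # pureness as fraction of matches
    return kept, pureness

# 12 copies: 11 matches and 1 non-match -> one copy removed, pureness 11/12.
kept, pureness = remove_minority_class([True] * 11 + [False])
print(len(kept), round(pureness, 3))  # 11 0.917
```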

Final number of weight vectors to use: 1074
  Number of unique weight vectors: 1028

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1028, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1028 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1028 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
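"Farthest first" selection is the greedy farthest-first traversal: starting from a seed vector, repeatedly add the vector whose minimum distance to the already-selected set is largest, which spreads the sample across the weight-vector space. A sketch with Euclidean distance (the first-element seed and the metric are assumptions, not necessarily the program's choices):

```python
import math

def farthest_first(vectors, k):
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # arbitrary seed (assumption)
    # Minimum distance from each candidate to the selected set so far.
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], dist(v, vectors[i]))
    return selected

pts = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
print(farthest_first(pts, 2))  # [[0.0, 0.0], [1.0, 1.0]]
```

Once a vector is selected its minimum distance drops to zero, so it is never picked again.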

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 940 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 123 matches and 817 non-matches
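The split step trains an SVM on the oracle-labelled vectors and uses it to partition the remaining unlabelled vectors into a predicted-match and a predicted-non-match cluster. A sketch with scikit-learn on synthetic stand-in data (the data, random seed, and linear kernel are all assumptions, not the program's actual settings):

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(42)
# Stand-in for the 25 match / 63 non-match training vectors (7 similarities).
X_train = np.vstack([rng.uniform(0.6, 1.0, (25, 7)),
                     rng.uniform(0.0, 0.5, (63, 7))])
y_train = np.array([1] * 25 + [0] * 63)

clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
clf.fit(X_train, y_train)

# Split the 940 remaining (unlabelled) weight vectors into two clusters.
X_rest = rng.uniform(0.0, 1.0, (940, 7))
pred = clf.predict(X_rest)
print(int(pred.sum()), "predicted matches,",
      int((pred == 0).sum()), "predicted non-matches")
```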

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 123 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(10)568_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985915
recall                 0.234114
f-measure              0.378378
da                           71
dm                            0
ndm                           0
tp                           70
fp                            1
tn                  4.76529e+07
fn                          229
Name: (10, 1 - acm diverg, 568), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)568_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 872
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 872 weight vectors
  Containing 186 true matches and 686 true non-matches
    (21.33% true matches)
  Identified 832 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   798  (95.91%)
          2 :    31  (3.73%)
          3 :     2  (0.24%)
          6 :     1  (0.12%)

Identified 0 non-pure unique weight vectors (from 832 unique weight vectors)
Pureness (as a percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 166
     0.000 : 666

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 872
  Number of unique weight vectors: 832

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (832, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 832 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 832 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 746 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 148 matches and 598 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (598, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 598 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 598 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 0 matches and 74 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

71.0
Analyzing file: diverg(15)905_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 905), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)905_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 925
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 925 weight vectors
  Containing 217 true matches and 708 true non-matches
    (23.46% true matches)
  Identified 870 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   834  (95.86%)
          2 :    33  (3.79%)
          3 :     2  (0.23%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 870 unique weight vectors)
Pureness (as a percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 687

Removed 1 non-pure weight vector

Final number of weight vectors to use: 924
  Number of unique weight vectors: 870

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (870, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 870 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 870 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
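The purity and entropy figures reported above can be reproduced from the match / non-match counts, assuming the standard definitions (purity = majority-class fraction, entropy = binary class entropy); this is a sketch, not the original script's code:

```python
import math

def cluster_purity_entropy(num_match, num_non_match):
    """Purity = fraction of the majority class; entropy = binary class
    entropy of the match/non-match split (a sketch of the measures in
    the log, assuming the standard definitions)."""
    total = num_match + num_non_match
    p_match = num_match / total
    purity = max(p_match, 1.0 - p_match)
    entropy = -sum(p * math.log2(p)
                   for p in (p_match, 1.0 - p_match) if p > 0.0)
    return purity, entropy

# 28 matches and 58 non-matches, as classified by the oracle above
purity, entropy = cluster_purity_entropy(28, 58)
print(f"{purity:.3f} {entropy:.3f}")  # 0.674 0.910
```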

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 784 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 165 matches and 619 non-matches
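The SVM split of the remaining cluster can be sketched as follows, assuming an SVM such as scikit-learn's `SVC` trained on the oracle-labelled vectors; the vectors below are synthetic stand-ins, not the real weight vectors:

```python
import numpy as np
from sklearn.svm import SVC  # assumes scikit-learn is installed

rng = np.random.default_rng(42)
# Synthetic stand-ins for the oracle-labelled training vectors
match_X = rng.uniform(0.6, 1.0, size=(28, 7))      # 28 labelled matches
non_match_X = rng.uniform(0.0, 0.5, size=(58, 7))  # 58 labelled non-matches
train_X = np.vstack([match_X, non_match_X])
train_y = np.array([1] * 28 + [0] * 58)

clf = SVC(kernel="linear")
clf.fit(train_X, train_y)

# Classify the rest of the cluster, splitting it into two sub-clusters
remainder_X = rng.uniform(0.0, 1.0, size=(784, 7))
pred = clf.predict(remainder_X)
print(f"Classified {int(pred.sum())} matches "
      f"and {int((pred == 0).sum())} non-matches")
```

The two predicted groups then go back onto the cluster queue as separate clusters, as the Loop 2 header shows.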

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (165, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (619, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 619 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 619 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)

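The farthest-first selection above can be sketched as a greedy farthest-first traversal: seed with one vector, then repeatedly add the vector whose nearest already-selected neighbour is farthest away (a sketch of the strategy the log names, not the original code):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric tuples:
    seed with the first vector, then repeatedly append the vector with
    the largest Euclidean distance to its nearest selected vector."""
    selected = [vectors[0]]
    while len(selected) < min(k, len(vectors)):
        def nearest(v):
            return min(math.dist(v, s) for s in selected)
        selected.append(max(vectors, key=nearest))
    return selected

demo = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
print(farthest_first(demo, 3))  # seed, then the two most spread-out points
```

Already-selected vectors have distance zero to themselves, so they are never re-picked while unselected vectors remain.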
Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 1 match and 73 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.103
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(20)513_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 513), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)513_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 854
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 854 weight vectors
  Containing 221 true matches and 633 true non-matches
    (25.88% true matches)
  Identified 798 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   762  (95.49%)
          2 :    33  (4.14%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)
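The unique-vector count and the frequency distribution above can be derived by counting duplicate weight vectors; a minimal sketch with `collections.Counter` (the data here is illustrative, not from the file):

```python
from collections import Counter

def weight_vector_frequencies(weight_vectors):
    """Count how often each unique weight vector occurs, then summarise
    that as {occurrence count: number of unique vectors occurring that
    often} -- the shape of the frequency table in the log."""
    uniq_counts = Counter(map(tuple, weight_vectors))
    freq_dist = Counter(uniq_counts.values())
    return uniq_counts, freq_dist

vecs = [[1.0, 0.0], [1.0, 0.0], [0.5, 0.5], [0.2, 0.8]]
uniq, dist = weight_vector_frequencies(vecs)
print(len(uniq), dict(dist))  # 3 unique vectors; one of them occurs twice
```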

Identified 1 non-pure unique weight vector (from 798 unique weight vectors)
Pureness (as the percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 612

Removed 1 non-pure weight vector
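The removal of minority-class copies of non-pure vectors can be sketched like this, assuming "non-pure" means copies of the same weight vector carry mixed true-match labels (a hypothetical helper, not the original code):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, match_labels):
    """For each unique weight vector, keep only the copies carrying its
    majority true-match label; the minority-class copies are what make
    a unique vector 'non-pure'. A sketch of the cleaning step."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, match_labels):
        groups[tuple(vec)].append(lab)
    kept = []
    for vec, labs in groups.items():
        majority = labs.count(True) >= labs.count(False)
        kept.extend((list(vec), lab) for lab in labs if lab == majority)
    return kept

# One vector occurs three times with labels T,T,F: the F copy is dropped
kept = remove_minority_copies(
    [[1.0], [1.0], [1.0], [0.5]], [True, True, False, False])
print(len(kept))  # 3
```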

Final number of weight vectors to use: 853
  Number of unique weight vectors: 798

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (798, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 798 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 798 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 713 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 150 matches and 563 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (563, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 563 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 563 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 5 matches and 69 non-matches
    Purity of oracle classification:  0.932
    Entropy of oracle classification: 0.357
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)847_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 847), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)847_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 780
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 780 weight vectors
  Containing 205 true matches and 575 true non-matches
    (26.28% true matches)
  Identified 733 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   698  (95.23%)
          2 :    32  (4.37%)
          3 :     2  (0.27%)
         12 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 733 unique weight vectors)
Pureness (as the percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 178
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 554

Removed 1 non-pure weight vector

Final number of weight vectors to use: 779
  Number of unique weight vectors: 733

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (733, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 733 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 733 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 31 matches and 54 non-matches
    Purity of oracle classification:  0.635
    Entropy of oracle classification: 0.947
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 648 weight vectors
  Based on 31 matches and 54 non-matches
  Classified 321 matches and 327 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (321, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)
    (327, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)

Current size of match and non-match training data sets: 31 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.95
- Size 327 weight vectors
- Estimated match proportion 0.365

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 327 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.556, 0.348, 0.467, 0.636, 0.412] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.269, 0.478, 0.750, 0.385, 0.455] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.538, 0.600, 0.471, 0.632, 0.688] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.800, 0.667, 0.381, 0.550, 0.429] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.667, 0.286, 0.556, 0.259, 0.250] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
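
Farthest-first selection, as used above, greedily picks each next weight vector to maximise its minimum distance to the vectors already selected, spreading the sample across the cluster. A minimal sketch (Euclidean distance, seeded with the first vector — the actual program may differ in seeding and metric):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum distance to the already-selected set is largest."""
    selected = [vectors[0]]  # seed choice is arbitrary in this sketch
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0)], 2)
print(sample)  # [(0.0, 0.0), (1.0, 1.0)]
```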

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 0 matches and 70 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)885_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 885), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)885_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 432
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 432 weight vectors
  Containing 184 true matches and 248 true non-matches
    (42.59% true matches)
  Identified 411 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   401  (97.57%)
          2 :     7  (1.70%)
          3 :     2  (0.49%)
         11 :     1  (0.24%)
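
The occurrence histogram above can be reproduced by counting duplicates among the weight-vector tuples; a minimal sketch (function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map 'number of occurrences' -> 'how many unique vectors occur that often'."""
    per_vector = Counter(map(tuple, weight_vectors))
    return dict(Counter(per_vector.values()))

vecs = [(0.1, 0.2)] * 3 + [(0.3, 0.4)] * 2 + [(0.5, 0.6)]
print(occurrence_distribution(vecs))  # {3: 1, 2: 1, 1: 1}
```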

Identified 1 non-pure unique weight vector (from 411 unique weight vectors)
Pureness (as the proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 163
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 247

Removed 1 non-pure weight vector
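
Pureness of a unique weight vector is the fraction of its occurrences that are true matches; for a non-pure vector (seen with both labels), the minority-class copies are removed. A minimal sketch of that clean-up (names and the tie-break rule are assumptions, not from the program):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """weight_vectors: list of (tuple_of_weights, is_match).
    Drop minority-class copies of any vector that occurs with both labels."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [non-match count, match count]
    for vec, is_match in weight_vectors:
        counts[vec][int(is_match)] += 1
    kept = []
    for vec, is_match in weight_vectors:
        non_m, m = counts[vec]
        majority_is_match = m >= non_m  # tie-break towards matches (assumption)
        if is_match == majority_is_match:
            kept.append((vec, is_match))
    return kept

# One vector seen 11 times: 10 matches, 1 non-match (pureness 10/11 ~ 0.909);
# the single minority-class copy is removed.
data = [((0.5, 0.5), True)] * 10 + [((0.5, 0.5), False)]
print(len(remove_non_pure(data)))  # 10
```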

Final number of weight vectors to use: 431
  Number of unique weight vectors: 411

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (411, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 411 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 411 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 33 matches and 45 non-matches
    Purity of oracle classification:  0.577
    Entropy of oracle classification: 0.983
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 333 weight vectors
  Based on 33 matches and 45 non-matches
  Classified 124 matches and 209 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (124, 0.5769230769230769, 0.9828586897127056, 0.4230769230769231)
    (209, 0.5769230769230769, 0.9828586897127056, 0.4230769230769231)

Current size of match and non-match training data sets: 33 / 45

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 209 weight vectors
- Estimated match proportion 0.423

Sample size for this cluster: 65

Farthest first selection of 65 weight vectors from 209 vectors
  The selected farthest weight vectors are:
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 65 weight vectors
  The oracle will correctly classify 65 weight vectors and wrongly classify 0
  Classified 9 matches and 56 non-matches
    Purity of oracle classification:  0.862
    Entropy of oracle classification: 0.580
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 65 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(15)682_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 682), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)682_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1055
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1055 weight vectors
  Containing 214 true matches and 841 true non-matches
    (20.28% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as the proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 177
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 820

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1054
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 161 matches and 750 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (161, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (750, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 161 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 161 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 46 matches and 9 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.643
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)883_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 883), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)883_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 537
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 537 weight vectors
  Containing 224 true matches and 313 true non-matches
    (41.71% true matches)
  Identified 498 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   479  (96.18%)
          2 :    16  (3.21%)
          3 :     2  (0.40%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 498 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 310

Removed 1 non-pure weight vector
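
The non-pure removal step can be sketched as follows: identical weight vectors are grouped, each group's pureness is its fraction of true matches, and in any mixed group (pureness strictly between 0 and 1) only the majority-class copies are kept. Names such as `remove_minority` are illustrative, not the program's real identifiers:

```python
from collections import defaultdict

def remove_minority(vectors):
    """Drop minority-class copies of non-pure weight-vector groups.

    `vectors` is a list of (weight_tuple, is_match) pairs; identical
    weight tuples are grouped, and for groups whose pureness (fraction
    of matches) is neither 0 nor 1, only majority-class copies survive.
    """
    groups = defaultdict(list)
    for w, is_match in vectors:
        groups[w].append(is_match)
    kept = []
    for w, labels in groups.items():
        pureness = sum(labels) / len(labels)
        for is_match in labels:
            # Pure groups are kept whole; mixed groups keep only the
            # majority class (matches iff pureness >= 0.5).
            if pureness in (0.0, 1.0) or is_match == (pureness >= 0.5):
                kept.append((w, is_match))
    return kept

# One group of 20 copies with pureness 0.95 loses its single
# non-match; the pure group keeps all 3 copies.
data = [((0.9, 1.0), True)] * 19 + [((0.9, 1.0), False)] + \
       [((0.1, 0.0), False)] * 3
print(len(remove_minority(data)))  # 23 -> 22 after removal: prints 22
```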

Final number of weight vectors to use: 536
  Number of unique weight vectors: 498

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (498, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 498 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 498 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
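
The "farthest first selection" above is the classic greedy k-centre traversal: seed with one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A sketch under the assumption of Euclidean distance and a first-element seed (the actual program may seed and measure distances differently):

```python
def farthest_first(vectors, k, dist=None):
    """Greedy farthest-first traversal over a list of numeric tuples.

    Starts from the first vector, then repeatedly adds the remaining
    vector whose minimum distance to the selected set is largest.
    """
    if dist is None:
        dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    # min_d[j] = distance from remaining[j] to its nearest selected vector
    min_d = [dist(v, selected[0]) for v in remaining]
    while len(selected) < k and remaining:
        i = max(range(len(remaining)), key=lambda j: min_d[j])
        chosen = remaining.pop(i)
        min_d.pop(i)
        selected.append(chosen)
        # A new centre can only shrink each remaining minimum distance.
        min_d = [min(d, dist(v, chosen)) for d, v in zip(min_d, remaining)]
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

The near-duplicate point (0.1, 0.0) is selected last, which is exactly why this strategy spreads the oracle budget across the weight-vector space instead of wasting it on clumps.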

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 33 matches and 47 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.978
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 418 weight vectors
  Based on 33 matches and 47 non-matches
  Classified 151 matches and 267 non-matches
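
The SVM step trains on the oracle-labelled sample (33 matches, 47 non-matches here) and splits the rest of the cluster by predicted class. A sketch assuming scikit-learn's `SVC` with a linear kernel; the original program may use a different SVM implementation or kernel, and `svm_split` is an illustrative name:

```python
from sklearn import svm

def svm_split(train_vectors, train_labels, cluster_vectors):
    """Train an SVM on the oracle-labelled sample and split the
    remaining weight vectors of the cluster into predicted matches
    (label 1) and non-matches (label 0)."""
    clf = svm.SVC(kernel='linear')
    clf.fit(train_vectors, train_labels)
    pred = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, pred) if p == 0]
    return matches, non_matches

# Tiny 1-D illustration: low weights labelled 0, high weights labelled 1.
m, n = svm_split([[0.0], [0.1], [0.9], [1.0]], [0, 0, 1, 1],
                 [[0.05], [0.95]])
print(len(m), len(n))  # 1 1
```

Each predicted sub-cluster then re-enters the queue, which is why the queue length grows from 1 to 2 in the next loop.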

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.5875, 0.9777945702913884, 0.4125)
    (267, 0.5875, 0.9777945702913884, 0.4125)

Current size of match and non-match training data sets: 33 / 47

Selected cluster (queue ordering: random) with:
- Purity 0.59 and entropy 0.98
- Size 267 weight vectors
- Estimated match proportion 0.412

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 267 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 6 matches and 63 non-matches
    Purity of oracle classification:  0.913
    Entropy of oracle classification: 0.426
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)230_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 230), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)230_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1058
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1058 weight vectors
  Containing 209 true matches and 849 true non-matches
    (19.75% true matches)
  Identified 1011 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   976  (96.54%)
          2 :    32  (3.17%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1011 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1057
  Number of unique weight vectors: 1011

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1011, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1011 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1011 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 924 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 104 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (104, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 104 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 104 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(20)297_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 297), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)297_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
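
The purity and entropy figures reported by the oracle step can be reproduced directly from the match / non-match counts. A minimal sketch (the function names are mine, not from the script):

```python
import math

def purity(num_matches, num_non_matches):
    # Fraction of the majority class in the oracle-classified sample.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Shannon entropy (base 2) of the match / non-match split.
    p = num_matches / (num_matches + num_non_matches)
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Counts from the oracle block above: 23 matches, 65 non-matches.
print(round(purity(23, 65), 3))   # 0.739
print(round(entropy(23, 65), 3))  # 0.829
```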

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
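
The split step trains a classifier on the oracle-labelled vectors and partitions the remaining cluster by predicted class. A sketch using a nearest-centroid stand-in for the SVM (the script itself uses an SVM, e.g. via a library such as scikit-learn; `split_cluster` is a hypothetical name):

```python
def split_cluster(cluster, match_train, non_match_train):
    # Stand-in for the SVM split: assign each remaining weight vector
    # to the class whose training-set centroid is closer.
    def centroid(vecs):
        return [sum(v[i] for v in vecs) / len(vecs)
                for i in range(len(vecs[0]))]
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    m_cen = centroid(match_train)
    n_cen = centroid(non_match_train)
    matches = [v for v in cluster if sqdist(v, m_cen) < sqdist(v, n_cen)]
    non_matches = [v for v in cluster if sqdist(v, m_cen) >= sqdist(v, n_cen)]
    return matches, non_matches

m, n = split_cluster([(0.9, 0.9), (0.1, 0.2)], [(1.0, 1.0)], [(0.0, 0.0)])
print(len(m), len(n))  # 1 1
```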

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
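
Farthest-first selection, as used above, can be sketched as a greedy traversal: start from one vector and repeatedly add the vector whose minimum distance to the already-selected set is largest. The names below are mine, and the script's implementation may differ (e.g. in its choice of start vector):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over weight vectors.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]  # arbitrary start vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest selected neighbour.
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.5, 0.5)]
print(farthest_first(pts, 2))  # [(0.0, 0.0), (1.0, 0.0)]
```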

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)211_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 211), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)211_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 855
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 855 weight vectors
  Containing 221 true matches and 634 true non-matches
    (25.85% true matches)
  Identified 799 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   763  (95.49%)
          2 :    33  (4.13%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)
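
The occurrence distribution above counts how often each unique weight vector appears, then how many vectors share each count. A minimal sketch with `collections.Counter` (the sample vectors are made up):

```python
from collections import Counter

# Hypothetical weight vectors, two of them identical.
vectors = [(1.0, 0.0), (1.0, 0.0), (0.5, 0.5), (0.2, 0.8)]

occurrences = Counter(vectors)                # unique vector -> count
distribution = Counter(occurrences.values())  # count -> number of vectors

for count in sorted(distribution):
    print(count, ':', distribution[count])
# 1 : 2
# 2 : 1
```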

Identified 1 non-pure unique weight vector (from 799 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 613
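
Pureness of a unique weight vector is the fraction of its occurrences generated by true matching record pairs; minority-class copies of non-pure vectors are removed. A sketch (the 20-occurrence example mirrors the tables in this log; `pureness` is my name for it):

```python
def pureness(match_labels):
    # Fraction of occurrences of one unique weight vector that are
    # true matches (1.0 = pure matches, 0.0 = pure non-matches).
    return sum(match_labels) / len(match_labels)

# A unique vector occurring 20 times: 19 true matches, 1 true non-match.
labels = [1] * 19 + [0]
print(round(pureness(labels), 3))  # 0.95
# The single minority-class (non-match) occurrence would be removed.
```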

Removed 1 non-pure weight vector

Final number of weight vectors to use: 854
  Number of unique weight vectors: 799

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (799, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 799 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 799 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 714 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 150 matches and 564 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (564, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 564 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 564 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 5 matches and 69 non-matches
    Purity of oracle classification:  0.932
    Entropy of oracle classification: 0.357
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(20)360_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 360), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)360_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 817 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 131 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
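
The purity and entropy reported here are the standard binary cluster measures: purity is the majority-class fraction, entropy the binary Shannon entropy of the match proportion. A minimal sketch (the function name `purity_entropy` is illustrative, not from the script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total                  # match proportion
    purity = max(p, 1.0 - p)
    if p in (0.0, 1.0):                      # a pure cluster has zero entropy
        return purity, 0.0
    entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy

# The 49 oracle-classified vectors above (48 matches, 1 non-match):
purity, entropy = purity_entropy(48, 1)
print(round(purity, 3), round(entropy, 3))   # -> 0.98 0.144
```

These reproduce the 0.980 purity and 0.144 entropy reported for the 49 oracle-classified vectors.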

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)161_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 161), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)161_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 731
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 731 weight vectors
  Containing 210 true matches and 521 true non-matches
    (28.73% true matches)
  Identified 698 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   684  (97.99%)
          2 :    11  (1.58%)
          3 :     2  (0.29%)
         19 :     1  (0.14%)
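
The unique-vector count and this occurrence distribution amount to two `Counter` passes over the loaded vectors; a minimal sketch (the sample vectors below are made up for illustration, not taken from the file):

```python
from collections import Counter

# Hypothetical weight vectors, stored as tuples so they are hashable;
# in the script these come from the loaded weight vector file.
weight_vectors = [
    (1.0, 1.0, 0.9), (1.0, 1.0, 0.9), (1.0, 1.0, 0.9),
    (0.5, 0.2, 0.1), (0.3, 0.3, 0.3),
]

vec_counts = Counter(weight_vectors)      # unique vector -> how often it occurs
freq_dist = Counter(vec_counts.values())  # occurrence -> number of unique vectors

print('Identified %d unique weight vectors' % len(vec_counts))
for occ in sorted(freq_dist):
    # Percentages relative to the number of unique vectors, as in the log
    pct = 100.0 * freq_dist[occ] / len(vec_counts)
    print('  %5d : %5d  (%.2f%%)' % (occ, freq_dist[occ], pct))
```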

Identified 1 non-pure unique weight vector (from 698 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 177
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 520

Removed 1 non-pure weight vector
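
A non-pure unique vector (identical weights but conflicting true-match labels) is handled by dropping its minority-class copies, which is why 731 vectors shrink to 730 here. A rough sketch of that clean-up, assuming exactly this drop-the-minority rule (`drop_minority_copies` is a hypothetical helper, not from the script):

```python
from collections import defaultdict

def drop_minority_copies(labelled_vectors):
    """For each unique weight vector keep only its majority-class copies.

    labelled_vectors: list of (weight_tuple, is_match) pairs.
    """
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[vec].append(is_match)

    kept = []
    for vec, labels in groups.items():
        matches, non_matches = labels.count(True), labels.count(False)
        majority = matches >= non_matches      # ties resolved towards matches
        kept += [(vec, majority)] * max(matches, non_matches)
    return kept
```

For a vector occurring 19 times with 18 match labels (pureness 18/19 ≈ 0.947), this keeps the 18 match copies and removes the single non-match copy.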

Final number of weight vectors to use: 730
  Number of unique weight vectors: 698

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (698, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 698 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 698 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
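
Farthest-first ("far") selection greedily picks each next vector to maximise its distance to the already-selected set, which is why the sample above covers the extremes of the similarity space. A minimal sketch using Euclidean distance (the script's actual distance metric and starting rule are assumptions here):

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(vectors, k, seed=0):
    """Greedily select k of the given (distinct) vectors, each chosen to be
    farthest from its nearest already-selected vector."""
    rng = random.Random(seed)
    selected = [rng.choice(vectors)]           # arbitrary starting vector
    # Distance of every vector to its nearest selected vector so far.
    nearest = [euclidean(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=nearest.__getitem__)
        selected.append(vectors[idx])
        nearest = [min(d, euclidean(v, vectors[idx]))
                   for v, d in zip(vectors, nearest)]
    return selected
```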

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 614 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 122 matches and 492 non-matches
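
After the oracle labels the sample, the remaining vectors of the cluster are split by a classifier trained on those labels, here an SVM. A minimal sketch with scikit-learn's `SVC` (the kernel and parameters are assumptions; the script's SVM settings are not shown in this log):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on the oracle-labelled sample and partition the
    remaining vectors into predicted-match / predicted-non-match clusters."""
    clf = SVC(kernel='linear')                 # assumed kernel
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    match_cluster = [v for v, p in zip(remaining_vecs, preds) if p == 1]
    non_match_cluster = [v for v, p in zip(remaining_vecs, preds) if p == 0]
    return match_cluster, non_match_cluster
```

The two child clusters then re-enter the processing queue, which is why the queue length grows to 2 in the next loop.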

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (122, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (492, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 122 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 122 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 50 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.139
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)87_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 87), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)87_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 361
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 361 weight vectors
  Containing 203 true matches and 158 true non-matches
    (56.23% true matches)
  Identified 330 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   316  (95.76%)
          2 :    11  (3.33%)
          3 :     2  (0.61%)
         17 :     1  (0.30%)

Identified 1 non-pure unique weight vector (from 330 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 172
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 157

Removed 1 non-pure weight vector

Final number of weight vectors to use: 360
  Number of unique weight vectors: 330

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (330, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 330 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 330 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 50 matches and 24 non-matches
    Purity of oracle classification:  0.676
    Entropy of oracle classification: 0.909
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  24
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 256 weight vectors
  Based on 50 matches and 24 non-matches
  Classified 256 matches and 0 non-matches

53.0
Analysing file: diverg(10)478_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 478), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)478_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 380
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 380 weight vectors
  Containing 216 true matches and 164 true non-matches
    (56.84% true matches)
  Identified 347 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   331  (95.39%)
          2 :    13  (3.75%)
          3 :     2  (0.58%)
         17 :     1  (0.29%)

Identified 1 non-pure unique weight vector (from 347 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 183
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 163

Removed 1 non-pure weight vector

Final number of weight vectors to use: 379
  Number of unique weight vectors: 347

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (347, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 347 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 347 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 46 matches and 29 non-matches
    Purity of oracle classification:  0.613
    Entropy of oracle classification: 0.963
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  29
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 272 weight vectors
  Based on 46 matches and 29 non-matches
  Classified 272 matches and 0 non-matches

42.0
Analysing file: diverg(20)442_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (20, 1 - acm diverg, 442), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)442_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 724
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 724 weight vectors
  Containing 212 true matches and 512 true non-matches
    (29.28% true matches)
  Identified 671 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   636  (94.78%)
          2 :    32  (4.77%)
          3 :     2  (0.30%)
         18 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 671 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 179
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 491

Removed 1 non-pure weight vector

Final number of weight vectors to use: 723
  Number of unique weight vectors: 671

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (671, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 671 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 671 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
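
The farthest-first selections throughout this log can be sketched as below; a minimal version assuming Euclidean distance and a random starting vector (the script's actual distance measure and seeding are not shown in the log, and `farthest_first` is a name chosen here):

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Greedily select k vectors: each new pick maximises its minimum
    Euclidean distance to the vectors selected so far."""
    rng = random.Random(seed)
    remaining = list(vectors)
    # Start from a randomly chosen vector.
    selected = [remaining.pop(rng.randrange(len(remaining)))]
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest already-selected vector.
        far = max(remaining,
                  key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(far)
        selected.append(far)
    return selected
```

Selecting 84 of 671 vectors this way spreads the sample over the extremes of the similarity space, which is why the listing above mixes clear matches (mostly high similarities) with clear non-matches.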

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
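
The purity, entropy, and estimated match proportion printed for each cluster follow directly from the oracle's match/non-match counts; a minimal sketch of the computation (`cluster_stats` is a name chosen here):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity, binary entropy, and match proportion of a labelled sample."""
    total = num_matches + num_non_matches
    p = num_matches / total                       # estimated match proportion
    purity = max(num_matches, num_non_matches) / total
    # Binary (Shannon) entropy of the match/non-match split.
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p

purity, entropy, p = cluster_stats(27, 57)  # counts from the oracle above
# purity ≈ 0.679, entropy ≈ 0.906, p ≈ 0.321
```

These are the values that reappear as (142, 0.6785…, 0.9059…, 0.3214…) in the Loop 2 queue: both child clusters initially inherit the statistics of the parent's labelled sample.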

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 587 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 142 matches and 445 non-matches
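
The split of the remaining 587 vectors is driven by a classifier trained on the oracle-labelled sample. A minimal sketch using scikit-learn's `svm.SVC`; the kernel and parameters are assumptions, since the log does not show them:

```python
from sklearn import svm

def split_cluster(train_vecs, train_labels, rest_vecs):
    """Train an SVM on the labelled sample, then split the remaining
    vectors of the cluster by predicted class (1 = match, 0 = non-match)."""
    clf = svm.SVC(kernel="linear")  # assumed kernel; not shown in the log
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(rest_vecs)
    pred_matches = [v for v, y in zip(rest_vecs, pred) if y == 1]
    pred_non_matches = [v for v, y in zip(rest_vecs, pred) if y == 0]
    return pred_matches, pred_non_matches
```

Both resulting sub-clusters go back into the queue, as the Loop 2 header below (queue length 2, cluster sizes 142 and 445) shows.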

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (445, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 445 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 445 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 8 matches and 62 non-matches
    Purity of oracle classification:  0.886
    Entropy of oracle classification: 0.513
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0
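
An oracle with accuracy below 100% can be simulated by flipping the labels of a fixed share of the sampled vectors. A sketch, assuming (as the "correctly classify X / wrongly classify 0" lines suggest) that the number of wrong answers is round((1 - accuracy) * n), with the affected vectors chosen at random:

```python
import random

def noisy_oracle(true_labels, accuracy, seed=42):
    """Flip round((1 - accuracy) * n) of the true match labels."""
    rng = random.Random(seed)
    n = len(true_labels)
    num_wrong = int(round((1.0 - accuracy) * n))
    wrong = set(rng.sample(range(n), num_wrong))
    return [(not lab) if i in wrong else lab
            for i, lab in enumerate(true_labels)]
```

With accuracy 1.0 (as in this run), `num_wrong` is 0 and the oracle reproduces the true match status exactly, so all false match/non-match counts stay at zero.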

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing file: diverg(20)229_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 229), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)229_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805
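
The uniqueness, frequency, and pureness figures above can be reproduced with two passes over the data. A sketch, assuming each input row is a (weight_tuple, is_match) pair (`analyse_vectors` is a name chosen here):

```python
from collections import Counter, defaultdict

def analyse_vectors(rows):
    """Per unique weight vector: occurrence count and pureness
    (fraction of its occurrences that are true matches)."""
    freq = Counter(vec for vec, _ in rows)
    match_count = defaultdict(int)
    for vec, is_match in rows:
        match_count[vec] += int(is_match)
    pureness = {vec: match_count[vec] / freq[vec] for vec in freq}
    # Occurrence count -> number of unique vectors occurring that often.
    occurrence_dist = Counter(freq.values())
    return freq, pureness, occurrence_dist
```

A vector with pureness strictly between 0 and 1 is non-pure; its minority-class copies are removed, e.g. the single non-match copy of the vector with pureness 0.950 above.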

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.722, 0.471, 0.545, 0.579] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.556, 0.182, 0.500, 0.071, 0.400] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 0.963, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.344, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.300, 0.524, 0.727, 0.762] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 13 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (13, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (706, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 13 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 12

Farthest first selection of 12 weight vectors from 13 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.958, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.971, 0.952, 1.000] (True)
    [1.000, 1.000, 1.000, 0.952, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.980, 1.000] (True)
    [0.971, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.933, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 12 weight vectors
  The oracle will correctly classify 12 weight vectors and wrongly classify 0
  Classified 12 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 12 weight vectors (classified by oracle) from cluster

Cluster is pure enough and not too large, add its 13 weight vectors to:
  Match training set

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 3: Queue length: 1
  Number of manual oracle classifications performed: 98
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (706, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 37 / 62

Selected cluster (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.462, 0.889, 0.455, 0.211, 0.375] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.440, 0.786, 0.545, 0.389, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.583, 0.444, 0.412, 0.318, 0.421] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 22 matches and 48 non-matches
    Purity of oracle classification:  0.686
    Entropy of oracle classification: 0.898
    Number of true matches:      22
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)636_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 636), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)636_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
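
The purity and entropy values reported by each oracle step follow directly from the match/non-match counts: purity is the majority-class fraction and entropy is the binary Shannon entropy of the split. A minimal sketch of that computation (the function name is illustrative, not from the original program):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is the binary
    (Shannon) entropy of the match / non-match split, in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0)
    return purity, entropy

purity, entropy = purity_entropy(23, 65)
print(round(purity, 3), round(entropy, 3))  # 0.739 0.829
```

For the 23 matches and 65 non-matches above this gives purity 65/88 = 0.739 and entropy 0.829, matching the logged values.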

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches
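
The SVM step trains on the oracle-labelled vectors and splits the remaining cluster into predicted matches and non-matches, which then become two new queue entries. The classifier itself is not shown in the log; as a stand-in for the SVM, a simple nearest-centroid split (all names here are illustrative) shows how the labelled examples partition the rest of a cluster:

```python
def centroid(vecs):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vecs)
    return [sum(col) / n for col in zip(*vecs)]

def split_cluster(cluster, match_examples, nonmatch_examples):
    """Nearest-centroid stand-in for the SVM split: assign each remaining
    weight vector to whichever labelled centroid it is closer to."""
    cm = centroid(match_examples)
    cn = centroid(nonmatch_examples)

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches = [v for v in cluster if dist2(v, cm) < dist2(v, cn)]
    non_matches = [v for v in cluster if dist2(v, cm) >= dist2(v, cn)]
    return matches, non_matches
```

In the run above the trained classifier assigns 101 of the 948 remaining vectors to the match side and 847 to the non-match side.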

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 101 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
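
Farthest-first selection, as logged above, greedily picks weight vectors that are maximally spread out in similarity space, so both very high and very low similarity vectors get sampled. A minimal sketch assuming plain Euclidean distance (not the original implementation):

```python
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first traversal: seed with one random vector, then
    repeatedly add the vector whose minimum Euclidean distance to the
    already-selected set is largest."""
    rng = random.Random(seed)
    remaining = list(vectors)
    selected = [remaining.pop(rng.randrange(len(remaining)))]
    while remaining and len(selected) < k:
        def min_dist(v):
            return min(sum((a - b) ** 2 for a, b in zip(v, s)) ** 0.5
                       for s in selected)
        idx = max(range(len(remaining)), key=lambda i: min_dist(remaining[i]))
        selected.append(remaining.pop(idx))
    return selected
```

Each round is O(|selected| * |remaining|), which is acceptable for the cluster sizes seen in this log.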

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(20)558_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 558), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)558_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
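
The pureness filtering removes the minority-class copies of any unique weight vector that occurs with both labels; here the vector occurring twenty times (19 matches and 1 non-match, pureness 0.950) loses its single non-match copy. A sketch of that step, assuming each weight vector is paired with its true-match label (names are illustrative):

```python
from collections import Counter

def remove_minority_copies(labelled_vectors):
    """Drop minority-class copies of any unique weight vector that occurs
    with both labels (i.e. is not pure). Each element is a (vector, label)
    pair with a hashable vector (e.g. a tuple) and a boolean label."""
    counts = Counter(labelled_vectors)
    kept = []
    for item in labelled_vectors:
        vec, label = item
        n_match = counts[(vec, True)]
        n_non = counts[(vec, False)]
        if n_match and n_non and label != (n_match >= n_non):
            continue  # minority copy of a non-pure vector: removed
        kept.append(item)
    return kept
```

Applied to the 1101 vectors loaded here, this drops exactly one copy, leaving the 1100 vectors reported below.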

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)601_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 601), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)601_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)335_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 335), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)335_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 839
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 839 weight vectors
  Containing 213 true matches and 626 true non-matches
    (25.39% true matches)
  Identified 785 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (95.54%)
          2 :    32  (4.08%)
          3 :     2  (0.25%)
         19 :     1  (0.13%)
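
The duplicate analysis above (unique weight vectors plus a frequency-of-frequency table) can be sketched with two nested `Counter` passes; `occurrence_distribution` is an illustrative name, not the program's:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then how many
    distinct vectors share each occurrence count."""
    # how many times each distinct vector appears
    vec_counts = Counter(map(tuple, weight_vectors))
    # frequency-of-frequency: occurrence count -> number of distinct vectors
    freq_of_freq = Counter(vec_counts.values())
    return dict(sorted(freq_of_freq.items()))
```

On the 839-vector file above this would yield `{1: 750, 2: 32, 3: 2, 19: 1}`, i.e. 785 unique vectors.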

Identified 1 non-pure unique weight vector (from 785 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 605

Removed 1 non-pure weight vector

Final number of weight vectors to use: 838
  Number of unique weight vectors: 785

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (785, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 785 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 785 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
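
The "farthest first" selection shown above is the classic greedy farthest-first traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A self-contained sketch (Euclidean distance and the fixed `start` index are assumptions; the program's metric and seeding may differ):

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    minimum Euclidean distance to the already-selected set is largest."""
    vecs = np.asarray(vectors, dtype=float)
    selected = [start]
    # minimum distance from each vector to the selected set so far
    min_dist = np.linalg.norm(vecs - vecs[start], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(vecs - vecs[nxt], axis=1))
    return selected
```

Each step costs one distance pass over all vectors, so selecting k of n vectors is O(kn) distance computations.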

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 700 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 137 matches and 563 non-matches
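
After the oracle-labelled sample is deleted, the remaining 700 vectors are split into two child clusters by a classifier trained on that sample (27 matches, 58 non-matches). A hedged sketch using scikit-learn's `SVC` with a linear kernel as a stand-in; the program's actual SVM implementation and parameters are not visible in this log:

```python
import numpy as np
from sklearn.svm import SVC  # stand-in for the program's SVM classifier

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on the oracle-labelled sample, then split the
    unlabelled remainder into predicted-match (label 1) and
    predicted-non-match (label 0) child clusters."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    remaining = np.asarray(remaining_vecs, dtype=float)
    preds = clf.predict(remaining)
    return remaining[preds == 1], remaining[preds == 0]
```

The two predicted sub-clusters are then pushed back onto the queue, which is why the next loop shows a queue of length 2 (sizes 137 and 563).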

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (137, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (563, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 137 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 137 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 50 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.235
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(20)426_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 426), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)426_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)231_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 231), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)231_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1069
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1069 weight vectors
  Containing 221 true matches and 848 true non-matches
    (20.67% true matches)
  Identified 1013 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   977  (96.45%)
          2 :    33  (3.26%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1013 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1068
  Number of unique weight vectors: 1013

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1013, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1013 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1013 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
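The purity, entropy, and estimated match proportion reported for each cluster can be reproduced from the oracle's match and non-match counts. A minimal sketch, assuming majority-class purity and binary Shannon entropy (which is consistent with the figures logged above; `cluster_stats` is a hypothetical helper, not the script's actual function):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity, entropy and match proportion of a labelled cluster.

    Assumed formulas: purity is the majority-class fraction, entropy is
    the binary Shannon entropy of the match/non-match split.
    """
    total = num_matches + num_non_matches
    p = num_matches / total                # estimated match proportion
    purity = max(p, 1.0 - p)               # fraction of the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)    # binary Shannon entropy
    return purity, entropy, p

# The oracle call above labelled 27 matches and 60 non-matches:
purity, entropy, match_prop = cluster_stats(27, 60)
print(round(purity, 3), round(entropy, 3), round(match_prop, 3))
# → 0.69 0.894 0.31
```

These values match the `(142, 0.6896..., 0.8935..., 0.3103...)` tuples shown for the two child clusters in the queue below, which inherit the statistics of the sample their parent was split on.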

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 926 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 142 matches and 784 non-matches
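The splitting step trains a classifier on the oracle-labelled sample and partitions the remaining cluster vectors by its predictions. A minimal sketch using scikit-learn's `SVC` as a stand-in for the script's actual SVM implementation (`svm_split` and the synthetic data are illustrative assumptions):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining cluster into predicted-match and predicted-non-match
    child clusters."""
    clf = SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    return cluster_vecs[pred == 1], cluster_vecs[pred == 0]

# Tiny synthetic demo: two well-separated groups of weight vectors.
rng = np.random.default_rng(0)
train = np.vstack([rng.uniform(0.7, 1.0, (10, 3)),    # match-like
                   rng.uniform(0.0, 0.3, (10, 3))])   # non-match-like
labels = np.array([1] * 10 + [0] * 10)
rest = np.vstack([rng.uniform(0.7, 1.0, (5, 3)),
                  rng.uniform(0.0, 0.3, (5, 3))])
matches, non_matches = svm_split(train, labels, rest)
print(len(matches), len(non_matches))
```

Both child clusters are then pushed back onto the queue, as the `Loop 2: Queue length: 2` line below shows.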

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (784, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 142 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
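The farthest first selection above greedily picks weight vectors that are maximally distant from those already chosen, so the sample spreads over the whole cluster. A minimal sketch, assuming Euclidean distance and seeding with the first vector (the script's actual distance function and seeding strategy are not shown in this log):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising the distance to its
    nearest already-selected vector (Gonzalez-style traversal)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                 # seed with the first vector
    while len(selected) < k:
        # Pick the candidate whose closest selected vector is farthest away.
        best = max(vectors,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (1.0, 0.0)], 2)
print(sample)
# → [(0.0, 0.0), (1.0, 1.0)]
```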

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 50 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.235
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(15)322_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 322), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)322_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 714
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 714 weight vectors
  Containing 220 true matches and 494 true non-matches
    (30.81% true matches)
  Identified 678 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   662  (97.64%)
          2 :    13  (1.92%)
          3 :     2  (0.29%)
         20 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 678 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 493

Removed 1 non-pure weight vector

Final number of weight vectors to use: 713
  Number of unique weight vectors: 678
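The pureness analysis above groups identical weight vectors and, for any unique vector generated by both matching and non-matching record pairs, drops the minority-class copies so each unique vector keeps a single label. A minimal sketch (`remove_non_pure` is a hypothetical helper; breaking ties towards the match class is an assumption):

```python
from collections import defaultdict

def remove_non_pure(vectors_with_labels):
    """Group identical weight vectors and drop minority-class copies of
    every non-pure unique vector (one that carries both labels)."""
    groups = defaultdict(list)
    for vec, is_match in vectors_with_labels:
        groups[tuple(vec)].append(is_match)

    kept, removed = [], 0
    for vec, labels in groups.items():
        num_match = sum(labels)
        majority = num_match * 2 >= len(labels)   # assumed tie-break: match
        for is_match in labels:
            if is_match == majority:
                kept.append((vec, is_match))
            else:
                removed += 1
    return kept, removed

# Mirrors the log: one unique vector occurring 20 times with
# pureness 19/20 = 0.95 loses its single minority-class copy.
data = ([((1.0, 1.0), True)] * 19 + [((1.0, 1.0), False)]
        + [((0.0, 0.0), False)] * 3)
kept, removed = remove_non_pure(data)
print(len(kept), removed)
# → 22 1
```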

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (678, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 678 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 678 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 594 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 320 matches and 274 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (320, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (274, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 274 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 274 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.462, 0.667, 0.600, 0.389, 0.615] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.571, 0.867, 0.471, 0.583, 0.643] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 0 matches and 67 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)985_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984375
recall                 0.210702
f-measure              0.347107
da                           64
dm                            0
ndm                           0
tp                           63
fp                            1
tn                  4.76529e+07
fn                          236
Name: (10, 1 - acm diverg, 985), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)985_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 998
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 998 weight vectors
  Containing 199 true matches and 799 true non-matches
    (19.94% true matches)
  Identified 948 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   914  (96.41%)
          2 :    31  (3.27%)
          3 :     2  (0.21%)
         16 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 948 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 169
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 778

Removed 1 non-pure weight vector

Final number of weight vectors to use: 997
  Number of unique weight vectors: 948

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (948, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 948 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 948 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 32 matches and 55 non-matches
    Purity of oracle classification:  0.632
    Entropy of oracle classification: 0.949
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 861 weight vectors
  Based on 32 matches and 55 non-matches
  Classified 282 matches and 579 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (282, 0.632183908045977, 0.9489804585630242, 0.367816091954023)
    (579, 0.632183908045977, 0.9489804585630242, 0.367816091954023)

Current size of match and non-match training data sets: 32 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 282 weight vectors
- Estimated match proportion 0.368

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 282 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
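
The farthest-first traversal shown above can be sketched in a few lines. A minimal sketch, assuming Euclidean distance and seeding from the first vector (the script's actual seeding and tie-breaking may differ; `farthest_first` is an illustrative name, not a function from the script):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum distance to the already selected set is largest."""
    selected = [vectors[0]]  # assumed seed; the script may seed differently
    while len(selected) < min(k, len(vectors)):
        remaining = [v for v in vectors if v not in selected]
        # pick the candidate farthest from its nearest selected vector
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

Because each new pick maximises distance to everything chosen so far, the sample spreads across the whole weight-vector space rather than clustering around the densest region.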

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 46 matches and 22 non-matches
    Purity of oracle classification:  0.676
    Entropy of oracle classification: 0.908
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  22
    Number of false non-matches: 0
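
The purity and entropy figures reported after each oracle classification follow the usual definitions: majority-class fraction and binary Shannon entropy of the match/non-match split. A minimal sketch (`purity_entropy` is an illustrative name, not a function from the script):

```python
import math

def purity_entropy(num_match, num_nonmatch):
    """Majority-class fraction and binary entropy of a match/non-match split."""
    total = num_match + num_nonmatch
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 46 matches / 22 non-matches above this yields purity ≈ 0.676 and entropy ≈ 0.908, matching the log.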

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

64.0
Analysing the file: diverg(10)355_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.99
recall                 0.331104
f-measure              0.496241
da                          100
dm                            0
ndm                           0
tp                           99
fp                            1
tn                  4.76529e+07
fn                          200
Name: (10, 1 - acm diverg, 355), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)355_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 675
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 675 weight vectors
  Containing 161 true matches and 514 true non-matches
    (23.85% true matches)
  Identified 657 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   649  (98.78%)
          2 :     5  (0.76%)
          3 :     2  (0.30%)
         10 :     1  (0.15%)
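
This occurrence distribution is two nested counts: first how often each distinct weight vector occurs, then how many distinct vectors share each count. A sketch (the function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that occur exactly that often."""
    per_vector = Counter(tuple(v) for v in vectors)  # copies per vector
    return Counter(per_vector.values())              # vectors per count
```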

Identified 1 non-pure unique weight vector (from 657 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 143
     0.900 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 513

Removed 1 non-pure weight vector
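
Pureness of a unique weight vector is the fraction of its duplicate copies that are true matches; only vectors whose pureness is exactly 0.0 or 1.0 are kept. A sketch of the per-vector computation, assuming the true match labels are supplied alongside the vectors (names are illustrative):

```python
from collections import defaultdict

def pureness_per_vector(vectors, is_match):
    """Fraction of true matches among the duplicate copies of each
    unique weight vector (exactly 1.0 or 0.0 means the vector is pure)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [num matches, num copies]
    for v, m in zip(vectors, is_match):
        counts[tuple(v)][0] += int(m)
        counts[tuple(v)][1] += 1
    return {k: matches / copies for k, (matches, copies) in counts.items()}
```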

Final number of weight vectors to use: 674
  Number of unique weight vectors: 657

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (657, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 657 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 657 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 573 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 80 matches and 493 non-matches
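
The split step trains a classifier on the oracle-labelled vectors and uses it to divide the remaining cluster into a predicted-match child and a predicted-non-match child. The run above uses an SVM; as a dependency-free stand-in with the same mechanics, a nearest-centroid split can be sketched (this is not the script's SVM, only an illustration of the divide step):

```python
import math

def centroid_split(match_train, nonmatch_train, unlabelled):
    """Assign each remaining vector to the nearer training-class centroid.
    (Stand-in for the SVM split used in the run above.)"""
    def centroid(vs):
        return [sum(col) / len(vs) for col in zip(*vs)]
    cm, cn = centroid(match_train), centroid(nonmatch_train)
    matches, nonmatches = [], []
    for v in unlabelled:
        nearer = matches if math.dist(v, cm) <= math.dist(v, cn) else nonmatches
        nearer.append(v)
    return matches, nonmatches
```

The two resulting child clusters are then pushed back onto the queue, as the next loop's queue length of 2 shows.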

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (80, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (493, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 493 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 493 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.462, 0.609, 0.643, 0.706, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 17 matches and 54 non-matches
    Purity of oracle classification:  0.761
    Entropy of oracle classification: 0.794
    Number of true matches:      17
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing the file: diverg(10)817_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (10, 1 - acm diverg, 817), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)817_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 408
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 408 weight vectors
  Containing 180 true matches and 228 true non-matches
    (44.12% true matches)
  Identified 387 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   377  (97.42%)
          2 :     7  (1.81%)
          3 :     2  (0.52%)
         11 :     1  (0.26%)

Identified 1 non-pure unique weight vector (from 387 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 159
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 227

Removed 1 non-pure weight vector

Final number of weight vectors to use: 407
  Number of unique weight vectors: 387

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (387, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 387 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 387 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 36 matches and 41 non-matches
    Purity of oracle classification:  0.532
    Entropy of oracle classification: 0.997
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  41
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 310 weight vectors
  Based on 36 matches and 41 non-matches
  Classified 117 matches and 193 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (117, 0.5324675324675324, 0.9969562518473083, 0.4675324675324675)
    (193, 0.5324675324675324, 0.9969562518473083, 0.4675324675324675)

Current size of match and non-match training data sets: 36 / 41

Selected cluster with (queue ordering: random):
- Purity 0.53 and entropy 1.00
- Size 117 weight vectors
- Estimated match proportion 0.468

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 117 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 47 matches and 6 non-matches
    Purity of oracle classification:  0.887
    Entropy of oracle classification: 0.510
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing the file: diverg(10)986_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 986), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)986_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 687
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 687 weight vectors
  Containing 191 true matches and 496 true non-matches
    (27.80% true matches)
  Identified 663 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   646  (97.44%)
          2 :    14  (2.11%)
          3 :     2  (0.30%)
          7 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 663 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 169
     0.000 : 494

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 687
  Number of unique weight vectors: 663

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (663, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 663 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 663 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
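
The "far" method above is a farthest-first traversal over the cluster's weight vectors. A minimal sketch, assuming Euclidean distance and a random starting vector (the program's actual distance function and tie-breaking are not shown in the log):

```python
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first traversal: pick a random start, then repeatedly
    add the vector whose minimum distance to the selected set is largest."""
    rng = random.Random(seed)

    def dist(a, b):  # Euclidean distance (an assumption)
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [rng.choice(vectors)]
    while len(selected) < k:
        nxt = max(vectors, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(nxt)
    return selected

corners = [(0.0, 0.0), (1.0, 1.0), (0.05, 0.1), (0.95, 0.9), (0.5, 0.5)]
print(farthest_first(corners, 3))
```

Each greedy step costs one distance evaluation per candidate-selected pair, so the traversal stays cheap as long as the sample size is small relative to the cluster.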

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 34 matches and 50 non-matches
    Purity of oracle classification:  0.595
    Entropy of oracle classification: 0.974
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0
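
The purity, entropy, and estimated match proportion that the log reports for an oracle-classified sample follow directly from the match and non-match counts. A sketch assuming binary Shannon entropy in bits:

```python
import math

def sample_stats(num_match, num_non_match):
    """Purity is the majority-class fraction, entropy the binary Shannon
    entropy (bits) of the match proportion; the match proportion itself
    becomes the cluster's estimated match proportion."""
    p = num_match / (num_match + num_non_match)
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p

# Counts from the oracle classification above: 34 matches, 50 non-matches
purity, entropy, prop = sample_stats(34, 50)
print(round(purity, 3), round(entropy, 3), round(prop, 3))  # 0.595 0.974 0.405
```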

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 579 weight vectors
  Based on 34 matches and 50 non-matches
  Classified 272 matches and 307 non-matches
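
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by its predictions. A sketch on synthetic data, assuming a scikit-learn `SVC` with a linear kernel (the program's actual SVM settings are unknown):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-ins: 84 oracle-labelled weight vectors with 7 weights each
X_train = rng.random((84, 7))
y_train = (X_train.mean(axis=1) > 0.5).astype(int)  # 1 = match, 0 = non-match

clf = SVC(kernel="linear")
clf.fit(X_train, y_train)

# Split the 579 unlabelled vectors left in the cluster into two child clusters
X_rest = rng.random((579, 7))
pred = clf.predict(X_rest)
match_cluster = X_rest[pred == 1]
non_match_cluster = X_rest[pred == 0]
print(len(match_cluster), len(non_match_cluster))
```

Both child clusters then go back on the queue with the statistics estimated from the oracle sample, as in Loop 2 below.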

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (272, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)
    (307, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)

Current size of match and non-match training data sets: 34 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.60 and entropy 0.97
- Size 272 weight vectors
- Estimated match proportion 0.405

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 272 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 44 matches and 25 non-matches
    Purity of oracle classification:  0.638
    Entropy of oracle classification: 0.945
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  25
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analysing file: diverg(20)293_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (20, 1 - acm diverg, 293), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)293_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 201 true matches and 752 true non-matches
    (21.09% true matches)
  Identified 908 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   874  (96.26%)
          2 :    31  (3.41%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 908 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector
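
The pureness filter above can be sketched as follows: for each unique weight vector, pureness is the fraction of its labelled occurrences that are true matches, and only the minority-class copies of non-pure vectors are removed (toy data; the 10-match/1-non-match vector mirrors the 0.909 case in this log):

```python
from collections import defaultdict

# vector -> [non_match_count, match_count]
counts = defaultdict(lambda: [0, 0])
labelled = ([((0.9, 1.0), True)] * 10 + [((0.9, 1.0), False)]
            + [((0.2, 0.0), False)] * 3)
for vec, is_match in labelled:
    counts[vec][int(is_match)] += 1

removed, kept = 0, []
for vec, (non_match, match) in counts.items():
    pureness = match / (match + non_match)
    if 0.0 < pureness < 1.0:
        removed += min(match, non_match)   # drop minority-class copies only
        kept += [vec] * max(match, non_match)
    else:
        kept += [vec] * (match + non_match)

print(removed, len(kept))  # 1 removed, 13 weight vectors kept
```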

Final number of weight vectors to use: 952
  Number of unique weight vectors: 908

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (908, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 908 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 908 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 821 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 119 matches and 702 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (119, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (702, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 702 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 702 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 13 matches and 59 non-matches
    Purity of oracle classification:  0.819
    Entropy of oracle classification: 0.681
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(20)780_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 780), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)780_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 754
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 754 weight vectors
  Containing 222 true matches and 532 true non-matches
    (29.44% true matches)
  Identified 718 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   699  (97.35%)
          2 :    16  (2.23%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 718 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 529

Removed 1 non-pure weight vector

Final number of weight vectors to use: 753
  Number of unique weight vectors: 718

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (718, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 718 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 718 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
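
The purity and entropy figures reported above are the majority-class fraction and the binary Shannon entropy (in bits) of the match/non-match split. A minimal sketch of the computation (the function name is illustrative, not taken from the script):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class; entropy: binary Shannon
    entropy (in bits) of the match/non-match proportions."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# 28 matches and 56 non-matches, as classified by the oracle above:
purity, entropy = purity_and_entropy(28, 56)
print(round(purity, 3), round(entropy, 3))  # 0.667 0.918
```

A pure cluster (all matches or all non-matches) has purity 1.0 and entropy 0.0; a 50/50 split has purity 0.5 and entropy 1.0, which is why the initial cluster in each run is listed as (size, 0.5, 1.0, 0.5).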

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 634 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 135 matches and 499 non-matches
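
The SVM step above trains on the oracle-classified vectors and splits the remaining unlabelled vectors into two child clusters, which are then both pushed back onto the queue (hence queue length 2 in the next loop). A sketch using scikit-learn (an assumption — the library and the kernel choice are illustrative, not necessarily those of the original script):

```python
from sklearn import svm

def svm_split(train_matches, train_non_matches, remaining):
    """Fit a binary SVM on oracle-labelled weight vectors, then split the
    remaining vectors into predicted-match / predicted-non-match clusters."""
    X = train_matches + train_non_matches
    y = [1] * len(train_matches) + [0] * len(train_non_matches)
    clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(X, y)
    pred = clf.predict(remaining)
    matches = [v for v, p in zip(remaining, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining, pred) if p == 0]
    return matches, non_matches
```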

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (499, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.92
- Size 499 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 499 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
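
Farthest-first selection, as used above, greedily picks vectors that are maximally spread out, so the oracle sample covers the extremes of a cluster rather than its dense centre. A sketch of the standard traversal (Euclidean distance and a fixed starting index are assumptions here):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal (a k-center style heuristic): begin
    with one seed vector, then repeatedly add the vector whose distance to
    its nearest already-selected vector is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    # min_dist[i]: distance from vector i to the closest selected vector
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[nxt]))
    return selected
```

Each newly selected index maximises the minimum distance to everything already chosen, so the selection spreads across the whole weight-vector space.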

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 13 matches and 60 non-matches
    Purity of oracle classification:  0.822
    Entropy of oracle classification: 0.676
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)739_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (10, 1 - acm diverg, 739), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)739_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 814
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 814 weight vectors
  Containing 220 true matches and 594 true non-matches
    (27.03% true matches)
  Identified 758 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   722  (95.25%)
          2 :    33  (4.35%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
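
The frequency distribution above tallies how many unique weight vectors occur once, twice, and so on (here, one vector occurs 20 times). A minimal sketch of that counting (hypothetical helper, not from the script):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each unique weight vector occurs, then how many
    unique vectors share each occurrence count."""
    per_vector = Counter(tuple(w) for w in weight_vectors)
    return Counter(per_vector.values())
```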

Identified 1 non-pure unique weight vector (from 758 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 573

Removed 1 non-pure weight vector
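
A non-pure unique weight vector is one that occurs with both true match statuses; the minority-class copies are removed so every surviving unique vector has a single true status (here, one copy of the 0.950-pureness vector is dropped). A sketch under that interpretation (the helper name and data layout are assumptions):

```python
from collections import defaultdict

def remove_minority_class(weight_vectors):
    """weight_vectors: list of (tuple_of_weights, is_match). For each unique
    weight vector occurring with both match statuses, drop the copies with
    the minority status; pure vectors are kept in full."""
    groups = defaultdict(list)
    for w, m in weight_vectors:
        groups[w].append(m)
    kept = []
    for w, m in weight_vectors:
        statuses = groups[w]
        num_match = sum(statuses)
        if num_match == 0 or num_match == len(statuses):
            kept.append((w, m))  # pure vector: keep all copies
        else:
            # ties count as non-match here; tie handling is an assumption
            majority = num_match * 2 > len(statuses)
            if m == majority:
                kept.append((w, m))
    return kept
```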

Final number of weight vectors to use: 813
  Number of unique weight vectors: 758

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (758, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 758 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 758 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 673 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 146 matches and 527 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (527, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 527 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 527 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 5 matches and 67 non-matches
    Purity of oracle classification:  0.931
    Entropy of oracle classification: 0.364
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(15)281_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 281), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)281_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 711
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 711 weight vectors
  Containing 203 true matches and 508 true non-matches
    (28.55% true matches)
  Identified 685 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   671  (97.96%)
          2 :    11  (1.61%)
          3 :     2  (0.29%)
         12 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 685 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 177
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 507

Removed 1 non-pure weight vector

Final number of weight vectors to use: 710
  Number of unique weight vectors: 685

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (685, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 685 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 685 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 601 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 137 matches and 464 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (137, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (464, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.92
- Size 137 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 137 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
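The farthest-first selections above spread the sampled weight vectors as widely as possible across the similarity space. A minimal sketch of greedy farthest-first traversal under Euclidean distance (function and variable names are illustrative, not the script's actual API):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from an arbitrary vector, then
    repeatedly pick the vector whose minimum distance to the already-selected
    set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # The next pick maximises its distance to the closest selected vector
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Each new pick maximises its minimum distance to everything already selected, which is why the listed vectors differ so strongly in their individual similarity values.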

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 49 matches and 4 non-matches
    Purity of oracle classification:  0.925
    Entropy of oracle classification: 0.386
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0
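The purity and entropy reported for an oracle classification are the majority-class fraction and the binary Shannon entropy of the match/non-match split. A small sketch (the function name is an assumption, not the script's own):

```python
import math

def purity_entropy(num_match, num_nonmatch):
    """Majority-class purity and binary Shannon entropy (in bits) of a
    two-class match/non-match split."""
    total = num_match + num_nonmatch
    p = num_match / total
    purity = max(p, 1.0 - p)
    # Binary entropy; 0*log(0) terms are skipped
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 49 matches and 4 non-matches above this gives purity 0.925 and entropy 0.386, matching the log.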

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(15)504_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 504), dtype: object
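The precision, recall, and f-measure in the Series above follow directly from the tp/fp/fn counts it reports. As a quick consistency check (counts copied from the output above):

```python
# Counts taken from the Series printed above
tp, fp, fn = 52, 1, 247

precision = tp / (tp + fp)   # fraction of declared matches that are true
recall = tp / (tp + fn)      # fraction of true matches that were found
f_measure = 2 * precision * recall / (precision + recall)
```

These reproduce the reported precision 0.981132, recall 0.173913, and f-measure 0.295455.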

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)504_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 790
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 790 weight vectors
  Containing 212 true matches and 578 true non-matches
    (26.84% true matches)
  Identified 738 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   703  (95.26%)
          2 :    32  (4.34%)
          3 :     2  (0.27%)
         17 :     1  (0.14%)
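The frequency distribution above is a count-of-counts: how many distinct weight vectors occur exactly n times. A sketch of that computation (strings stand in for weight-vector tuples; the helper name is illustrative):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map each occurrence count n to the number of distinct weight vectors
    that occur exactly n times."""
    per_vector = Counter(vectors)        # weight vector -> occurrence count
    return Counter(per_vector.values())  # occurrence count -> number of vectors
```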

Identified 1 non-pure unique weight vector (from 738 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 557

Removed 1 non-pure weight vector
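A weight vector is non-pure when identical copies of it carry both match and non-match labels; only the minority-class copies are dropped, as the log reports. A sketch of that step (function name and data layout are assumptions, not the script's actual code):

```python
from collections import defaultdict

def remove_non_pure(vectors):
    """Drop minority-class copies of weight vectors that occur with both
    labels.  vectors: list of (weights_tuple, is_match) pairs."""
    counts = defaultdict(lambda: [0, 0])  # weights -> [non-match, match] counts
    for w, m in vectors:
        counts[w][int(m)] += 1
    kept = []
    for w, m in vectors:
        nonmatch, match = counts[w]
        if nonmatch and match:            # non-pure: keep majority class only
            majority = match >= nonmatch
            if m != majority:
                continue
        kept.append((w, m))
    return kept
```

With 16 match copies and 1 non-match copy of the same vector (pureness 16/17 ≈ 0.941, as above), exactly the single non-match copy is removed.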

Final number of weight vectors to use: 789
  Number of unique weight vectors: 738

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (738, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 738 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 738 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 653 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 139 matches and 514 non-matches
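The split step trains a classifier on the oracle-labelled vectors and divides the rest of the cluster by its predictions. The log uses an SVM; as a dependency-free stand-in, here is the same split with a simple perceptron (explicitly not the actual SVM, and all names are illustrative):

```python
def train_perceptron(X, y, epochs=50, lr=0.1):
    """Simple perceptron, standing in for the SVM split classifier.
    X: list of weight vectors, y: labels (1 = match, 0 = non-match)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
            err = yi - pred
            if err:
                w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
                b += lr * err
    return w, b

def split_cluster(cluster, w, b):
    """Divide the remaining vectors into predicted matches and non-matches."""
    matches, nonmatches = [], []
    for v in cluster:
        score = sum(wj * xj for wj, xj in zip(w, v)) + b
        (matches if score > 0 else nonmatches).append(v)
    return matches, nonmatches
```

The two predicted sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.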

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (139, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (514, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 514 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 514 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 8 matches and 64 non-matches
    Purity of oracle classification:  0.889
    Entropy of oracle classification: 0.503
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(15)25_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 25), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)25_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1037
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1037 weight vectors
  Containing 221 true matches and 816 true non-matches
    (21.31% true matches)
  Identified 983 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   946  (96.24%)
          2 :    34  (3.46%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 983 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 187
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 795

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1036
  Number of unique weight vectors: 983

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (983, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 983 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 983 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 896 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 137 matches and 759 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (137, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (759, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 137 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 137 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 50 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.139
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(20)772_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 772), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)772_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

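The "farthest first" selection above is presumably a greedy farthest-first traversal: start from one vector, then repeatedly pick the vector whose minimum distance to everything already selected is largest. A minimal sketch, assuming Euclidean distance and an arbitrary (first) starting vector; the actual script may seed or break ties differently:

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors: each step adds the vector whose minimum
    Euclidean distance to the already-selected vectors is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # arbitrary starting point
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected
```

This spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches and clear non-matches.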
Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

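The purity and entropy figures logged above follow the standard two-class definitions: purity is the fraction of the majority class and entropy is the Shannon entropy of the match proportion. A minimal sketch (the script's exact code may differ) that reproduces the 0.674 / 0.910 values for 28 matches and 58 non-matches:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Two-class purity and Shannon entropy of an oracle-labelled sample."""
    total = num_matches + num_non_matches
    p = num_matches / total            # estimated match proportion
    purity = max(p, 1.0 - p)           # fraction of the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# For 28 matches / 58 non-matches: purity ~ 0.674, entropy ~ 0.910
```

The same pair (0.6744..., 0.9103...) is then propagated to both child clusters in the queue in Loop 2.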
Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

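The split step trains a binary SVM on the oracle-labelled samples and partitions the remaining weight vectors by predicted class. A sketch using scikit-learn's `SVC` (an assumption here; the original script may use another SVM implementation or kernel):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train a binary SVM on oracle-labelled weight vectors and split the
    remaining vectors into predicted matches / non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)   # labels: 1 = match, 0 = non-match
    preds = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, preds) if p == 0]
    return matches, non_matches
```

Each resulting partition becomes a new cluster in the queue, inheriting the parent's purity and entropy estimates until it is sampled itself.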
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 153 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)337_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 337), dtype: object

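The precision, recall and f-measure in the pandas row above are consistent with the usual definitions computed from the tp/fp/fn counts in the same row (e.g. recall = 39 / (39 + 260) ≈ 0.130435). A quick check:

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# For the row above: tp=39, fp=0, fn=260
# -> precision = 1.0, recall ~ 0.130435, f1 ~ 0.230769
```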
Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)337_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)826_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 826), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)826_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
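The purity and entropy figures reported for each oracle-classified sample follow directly from the match / non-match counts: purity is the majority-class fraction and entropy is the binary class entropy in bits. A minimal sketch (the function name `purity_entropy` is ours, not from the script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the fraction of the majority class in the sample;
    entropy is the binary class entropy of the sample, in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# Reproduces the figures logged above for 25 matches / 63 non-matches:
purity, entropy = purity_entropy(25, 63)
print(round(purity, 3), round(entropy, 3))  # → 0.716 0.861
```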

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 817 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 131 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
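The farthest-first traversal behind these selections can be sketched in pure Python with Euclidean distance. For determinism this sketch seeds with the first vector, whereas the script's actual starting point may differ:

```python
def farthest_first(vectors, k):
    """Greedily select k vectors: after seeding with the first vector,
    repeatedly add the vector whose minimum Euclidean distance to the
    already-selected set is largest."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    remaining = list(vectors)
    selected = [remaining.pop(0)]
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected

print(farthest_first([(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)], 2))
# → [(0.0, 0.0), (10.0, 0.0)]
```

This greedy choice spreads the sample across the cluster, which is why the selections above mix clear matches with clear non-matches.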

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-matches
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)331_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 331), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)331_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1100
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1100 weight vectors
  Containing 227 true matches and 873 true non-matches
    (20.64% true matches)
  Identified 1043 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1006  (96.45%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
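The occurrence histogram above can be reproduced with two nested `collections.Counter` passes: one counting how often each weight vector occurs, one counting how many vectors share each occurrence count. The toy vectors below are hypothetical:

```python
from collections import Counter

# Toy weight vectors (tuples, so they are hashable)
vectors = [(1.0, 0.0), (1.0, 0.0), (0.5, 0.5), (0.9, 0.1)]

vec_counts = Counter(vectors)             # vector -> how often it occurs
histogram = Counter(vec_counts.values())  # occurrence -> number of vectors
print(sorted(histogram.items()))  # → [(1, 2), (2, 1)]
```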

Identified 1 non-pure unique weight vector (from 1043 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1099
  Number of unique weight vectors: 1043
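Pureness of a unique weight vector is the proportion of its occurrences generated by true matches; a value strictly between 0 and 1 marks the vector as non-pure, and the minority-class copies are removed. A sketch (the function name is ours):

```python
def pureness(match_count, non_match_count):
    """Fraction of record pairs sharing this unique weight vector that
    are true matches; 0 < pureness < 1 marks the vector as non-pure."""
    return match_count / (match_count + non_match_count)

# The log above reports one non-pure vector with pureness 0.950, e.g. a
# vector occurring 20 times, generated by 19 matches and 1 non-match.
# Its single minority-class (non-match) copy is the one removed.
print(pureness(19, 1))  # → 0.95
```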

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1043, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1043 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1043 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 955 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 955 non-matches

39.0
Analyzing file: diverg(10)837_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (10, 1 - acm diverg, 837), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)837_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 761
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 761 weight vectors
  Containing 187 true matches and 574 true non-matches
    (24.57% true matches)
  Identified 719 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   688  (95.69%)
          2 :    28  (3.89%)
          3 :     2  (0.28%)
         11 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 719 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 165
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 553

Removed 1 non-pure weight vector

Final number of weight vectors to use: 760
  Number of unique weight vectors: 719

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (719, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 719 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 719 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 635 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 308 matches and 327 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (308, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (327, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 308 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 308 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 41 matches and 28 non-matches
    Purity of oracle classification:  0.594
    Entropy of oracle classification: 0.974
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analyzing file: diverg(20)664_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 664), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)664_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1075
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1075 weight vectors
  Containing 208 true matches and 867 true non-matches
    (19.35% true matches)
  Identified 1028 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   993  (96.60%)
          2 :    32  (3.11%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1028 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1074
  Number of unique weight vectors: 1028

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1028, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1028 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1028 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

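The purity and entropy figures in the oracle summary above follow directly from the two class counts: purity is the majority-class fraction, entropy the binary Shannon entropy (in bits) of the match/non-match distribution. A minimal sketch reproducing the reported 0.716 / 0.861:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the fraction of the majority class; entropy is the
    binary Shannon entropy (in bits) of the class distribution."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# Counts reported by the oracle round above: 25 matches, 63 non-matches
purity, entropy = purity_entropy(25, 63)
print(round(purity, 3), round(entropy, 3))  # 0.716 0.861
```
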
Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 940 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 121 matches and 819 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (121, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (819, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 121 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 121 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 46 matches and 2 non-matches
    Purity of oracle classification:  0.958
    Entropy of oracle classification: 0.250
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

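The oracle rounds above run at 100% accuracy, so every queried label is returned unchanged; the `oracle_acc` parameter allows an imperfect human to be simulated. A sketch of such a noisy oracle (the random label-flip model is an assumption, not taken from the original code):

```python
import random

def noisy_oracle(true_labels, accuracy, seed=None):
    """Return each true match status unchanged with probability
    `accuracy`, and flipped otherwise (simulating an imperfect human)."""
    rnd = random.Random(seed)
    return [lbl if rnd.random() < accuracy else not lbl
            for lbl in true_labels]

# At accuracy 1.0 no label is ever flipped, as in the log above
print(noisy_oracle([True, False, False, True], 1.0))
# [True, False, False, True]
```
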
Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)178_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (10, 1 - acm diverg, 178), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)178_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 829
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 829 weight vectors
  Containing 227 true matches and 602 true non-matches
    (27.38% true matches)
  Identified 772 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   735  (95.21%)
          2 :    34  (4.40%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

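The occurrence frequency table above can be computed with two nested `Counter`s: one over the vectors themselves, one over the resulting counts. A short sketch:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then how
    many unique vectors share each occurrence count (as in the log)."""
    occ = Counter(tuple(v) for v in weight_vectors)
    dist = Counter(occ.values())
    total = len(occ)
    for count in sorted(dist):
        num = dist[count]
        print('%10d : %5d  (%.2f%%)' % (count, num, 100.0 * num / total))
    return dist

# Three unique vectors: one occurring twice, two occurring once
occurrence_distribution([(0.5,), (0.5,), (1.0,), (0.0,)])
```
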
Identified 1 non-pure unique weight vector (from 772 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vector with this pureness to be removed)
     0.000 : 581

Removed 1 non-pure weight vector

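The cleaning step above keeps only the majority-class copies of any duplicated weight vector whose occurrences carry mixed match labels (e.g. the 0.950-pure vector: 19 match copies kept, 1 non-match copy removed). A sketch of this step (the tie-break for pureness exactly 0.5 is an assumption):

```python
from collections import defaultdict

def remove_minority_copies(pairs):
    """pairs: (weight_vector, true_match_status) tuples. For each
    unique vector, compute its pureness (fraction of occurrences that
    are true matches) and drop minority-class copies of non-pure
    vectors, keeping pure vectors untouched."""
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        majority_is_match = pureness >= 0.5  # assumed tie-break
        for is_match in labels:
            if is_match == majority_is_match:
                kept.append((vec, is_match))
    return kept
```

For pure vectors (pureness 0.0 or 1.0) every copy equals the majority label, so nothing is removed; only the minority copies of mixed vectors are dropped.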
Final number of weight vectors to use: 828
  Number of unique weight vectors: 772

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (772, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 772 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 772 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

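Farthest-first selection, used above to pick a diverse sample from each cluster, greedily adds the vector whose minimum distance to the already-selected set is largest. A minimal sketch (Euclidean distance and a deterministic start from the first vector are assumptions; the original may randomise the start and use another metric):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of equal-length
    tuples, starting from the first vector."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]
    remaining = set(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest selected neighbour
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.discard(best)
    return selected
```
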
Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 687 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 150 matches and 537 non-matches

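The split step above trains an SVM on the oracle-labelled vectors and classifies the rest of the cluster into two child clusters (here 150 predicted matches and 537 predicted non-matches). A self-contained sketch using sub-gradient descent on the regularised hinge loss (the actual program presumably calls an SVM library; the hyper-parameters here are illustrative):

```python
import numpy as np

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM fitted by stochastic sub-gradient descent on the
    regularised hinge loss. y holds booleans (True = match)."""
    X = np.asarray(X, dtype=float)
    signs = np.where(np.asarray(y), 1.0, -1.0)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, si in zip(X, signs):
            if si * (xi @ w + b) < 1.0:   # inside margin or misclassified
                w += lr * (si * xi - lam * w)
                b += lr * si
            else:
                w -= lr * lam * w
    return w, b

def split_cluster(w, b, vectors):
    """Split remaining weight vectors into predicted matches and
    non-matches, as in the SVM classification step above."""
    V = np.asarray(vectors, dtype=float)
    is_match = V @ w + b > 0.0
    return V[is_match], V[~is_match]
```
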
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (537, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 537 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 537 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 9 matches and 64 non-matches
    Purity of oracle classification:  0.877
    Entropy of oracle classification: 0.539
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)152_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 152), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)152_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 683
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 683 weight vectors
  Containing 201 true matches and 482 true non-matches
    (29.43% true matches)
  Identified 638 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   604  (94.67%)
          2 :    31  (4.86%)
          3 :     2  (0.31%)
         11 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 638 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vector with this pureness to be removed)
     0.000 : 461

Removed 1 non-pure weight vector

Final number of weight vectors to use: 682
  Number of unique weight vectors: 638

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (638, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 638 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 638 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 26 matches and 57 non-matches
    Purity of oracle classification:  0.687
    Entropy of oracle classification: 0.897
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 555 weight vectors
  Based on 26 matches and 57 non-matches
  Classified 129 matches and 426 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (129, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)
    (426, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)

Current size of match and non-match training data sets: 26 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.90
- Size 129 weight vectors
- Estimated match proportion 0.313

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 129 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
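Farthest-first selection, as used above, can be sketched as a Gonzalez-style traversal: repeatedly pick the vector whose nearest already-selected neighbour is farthest away. This is a minimal sketch assuming Euclidean distance; names are illustrative, not the original program's.

```python
# Hedged sketch of farthest-first traversal over weight vectors.
import math

def farthest_first(vectors, k):
    """Select k vectors, each maximising distance to those already chosen."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed with an arbitrary vector
    while len(selected) < k:
        # candidate whose closest selected neighbour is farthest away
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

# Illustrative 2-D example:
sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.1, 0.1), (0.9, 0.2)], 2)
```

This spreads the sample across the cluster, which is why the selected vectors above mix clear matches and borderline cases rather than clustering together.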

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 49 matches and 2 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.239
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0
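The purity and entropy figures reported above appear consistent with the majority-class fraction and the binary Shannon entropy of the match/non-match split (e.g. 49 matches and 2 non-matches give 0.961 and 0.239). A sketch under that assumption:

```python
# Hedged sketch: purity as the majority-class fraction, entropy as the
# binary Shannon entropy of the match/non-match proportions.
import math

def purity_entropy(num_matches, num_non_matches):
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

purity, entropy = purity_entropy(49, 2)  # counts from the oracle step above
# purity ≈ 0.961, entropy ≈ 0.239
```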

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(20)260_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 260), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)260_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
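A frequency distribution like the one above can be computed with two nested `Counter`s: one counting how often each distinct vector occurs, the second counting how many vectors share each occurrence count. A sketch (tuples stand in for the actual weight vectors):

```python
# Hedged sketch of the occurrence-frequency distribution.
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of distinct vectors with that count."""
    per_vector = Counter(map(tuple, weight_vectors))
    return Counter(per_vector.values())

dist = occurrence_distribution([[0.5, 1.0], [0.5, 1.0], [0.9, 0.8]])
# one vector occurs twice and one occurs once: {2: 1, 1: 1}
```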

Identified 1 non-pure unique weight vectors (from 1044 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-matches
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)656_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (15, 1 - acm diverg, 656), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)656_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1005
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1005 weight vectors
  Containing 195 true matches and 810 true non-matches
    (19.40% true matches)
  Identified 963 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   928  (96.37%)
          2 :    32  (3.32%)
          3 :     2  (0.21%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 963 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 790

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1005
  Number of unique weight vectors: 963

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (963, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 963 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 963 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 876 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 138 matches and 738 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (738, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 138 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 138 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 48 matches and 4 non-matches
    Purity of oracle classification:  0.923
    Entropy of oracle classification: 0.391
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analysing the file: diverg(20)570_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 570), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)570_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1036 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
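
The farthest-first traversal used for the initial selection above can be sketched as a generic greedy k-centre pass. This is a minimal illustration, not the program's exact implementation: it assumes Euclidean distance and deterministically starts from the first vector (the original program may seed the start differently).

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first (k-centre) selection: repeatedly add the
    vector whose distance to its closest already-selected vector is
    largest. Starting from the first vector is an assumption here;
    the original program may pick the start at random."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    while len(selected) < min(k, len(vectors)):
        candidate = max(vectors,
                        key=lambda v: min(dist(v, s) for s in selected))
        selected.append(candidate)
    return selected
```

Each round is O(n * |selected|), which is why the log reports the selection per cluster rather than over the whole data set.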

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
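
The purity and entropy figures reported for each oracle-labelled sample follow the usual two-class definitions; the logged values are consistent with purity being the majority-class fraction and entropy the binary Shannon entropy of the match proportion. A minimal sketch under that assumption:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Majority-class purity, binary entropy, and estimated match
    proportion for a sample of oracle-labelled weight vectors."""
    total = num_matches + num_non_matches
    p = num_matches / total                  # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p
```

For the 23 matches and 65 non-matches above this gives purity ~0.739, entropy ~0.829 and match proportion ~0.261, matching the queue entries printed in Loop 2.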

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
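
The oracle's accuracy parameter (100.00% in these runs) is straightforward to simulate: each queried label is returned correctly with probability equal to the accuracy and flipped otherwise. A hedged sketch — the function name and seeding are illustrative, not the program's API:

```python
import random

def noisy_oracle(true_labels, accuracy, seed=0):
    """Return each true label unchanged with probability `accuracy`,
    flipped otherwise (a perfect oracle when accuracy == 1.0)."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

With accuracy 1.0 the counts of false matches and false non-matches are necessarily zero, as every oracle block in this log shows.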

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(20)461_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 461), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)461_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805
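
The pureness analysis above groups identical weight vectors and, for groups with mixed true-match labels, drops the minority-class copies — as with the single 0.950-pure vector (19 matches, 1 non-match among 20 identical copies). A sketch of that clean-up, under the assumption that exactly the minority copies of mixed groups are removed:

```python
from collections import defaultdict

def drop_minority_copies(vectors, labels):
    """Group identical weight vectors; in groups with mixed labels keep
    only the majority-class copies so every unique vector becomes pure."""
    groups = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        groups[tuple(vec)].append(lab)
    kept = []
    for vec, labs in groups.items():
        match_frac = sum(labs) / len(labs)   # pureness as match fraction
        majority = match_frac >= 0.5         # ties kept as matches here
        for lab in labs:
            if match_frac in (0.0, 1.0) or lab == majority:
                kept.append((list(vec), lab))
    return kept
```

Applied to this file, one copy is dropped, leaving 861 of the original 862 weight vectors, exactly as reported.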

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 566 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)572_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 572), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)572_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1092
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1092 weight vectors
  Containing 226 true matches and 866 true non-matches
    (20.70% true matches)
  Identified 1035 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   998  (96.43%)
          2 :    34  (3.29%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1035 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 845

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1091
  Number of unique weight vectors: 1035

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1035, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1035 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1035 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 24 matches and 64 non-matches
    Purity of oracle classification:  0.727
    Entropy of oracle classification: 0.845
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0
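
The purity and entropy figures in these oracle summaries are consistent with purity being the majority-class fraction of the sampled vectors and entropy being the binary Shannon entropy (in bits) of the match proportion. A minimal sketch, reproducing the 24-match / 64-non-match result above (the helper name `purity_entropy` is illustrative, not from the script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity: fraction of the majority class in the sample.
    # Entropy: binary Shannon entropy (bits) of the match proportion.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(num_matches, num_non_matches) / total
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = purity_entropy(24, 64)
print(round(purity, 3), round(entropy, 3))  # 0.727 0.845
```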

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 947 weight vectors
  Based on 24 matches and 64 non-matches
  Classified 91 matches and 856 non-matches
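
The split step trains a classifier on the oracle-labelled vectors and partitions the rest of the cluster by predicted class, as in the 91 / 856 split above. A minimal sketch using scikit-learn (the linear kernel and the helper name `split_cluster` are assumptions; the script's actual SVM settings may differ):

```python
import numpy as np
from sklearn.svm import SVC

def split_cluster(train_vecs, train_labels, cluster_vecs):
    # Train an SVM on oracle-labelled weight vectors, then split the
    # remaining unlabelled cluster into predicted-match and
    # predicted-non-match sub-clusters.
    clf = SVC(kernel="linear")  # assumed kernel choice
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    return cluster_vecs[pred == 1], cluster_vecs[pred == 0]

# Tiny synthetic example: two labelled clumps, two unlabelled vectors
train = np.array([[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]])
labels = np.array([1, 1, 0, 0])
cluster = np.array([[0.95, 0.85], [0.05, 0.15]])
matches, non_matches = split_cluster(train, labels, cluster)
print(len(matches), len(non_matches))  # 1 1
```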

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (91, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)
    (856, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)

Current size of match and non-match training data sets: 24 / 64

Selected cluster with (queue ordering: random):
- Purity 0.73 and entropy 0.85
- Size 856 weight vectors
- Estimated match proportion 0.273

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 856 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
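
"Farthest first selection" here refers to the greedy farthest-first traversal: seed the sample with one vector, then repeatedly add the vector whose minimum distance to the already selected set is largest. A minimal sketch (the Manhattan distance and the first-vector seeding are assumptions; the script may use a different metric or a random seed vector):

```python
def farthest_first(vectors, k):
    # Greedy farthest-first traversal over numeric weight vectors.
    def dist(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))  # Manhattan distance

    selected = [vectors[0]]  # assumed seeding rule
    while len(selected) < k:
        # Pick the vector farthest from its nearest selected neighbour.
        best = max(vectors, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

sample = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 0.0]]
print(farthest_first(sample, 3))  # [[0.0, 0.0], [1.0, 1.0], [0.9, 0.0]]
```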

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 18 matches and 52 non-matches
    Purity of oracle classification:  0.743
    Entropy of oracle classification: 0.822
    Number of true matches:      18
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)3_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 3), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)3_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 655
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 655 weight vectors
  Containing 213 true matches and 442 true non-matches
    (32.52% true matches)
  Identified 618 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   600  (97.09%)
          2 :    15  (2.43%)
          3 :     2  (0.32%)
         19 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 618 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 439

Removed 1 non-pure weight vector

Final number of weight vectors to use: 654
  Number of unique weight vectors: 618

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (618, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 618 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 618 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 29 matches and 54 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.934
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 535 weight vectors
  Based on 29 matches and 54 non-matches
  Classified 152 matches and 383 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)
    (383, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)

Current size of match and non-match training data sets: 29 / 54

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 383 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 383 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.870, 0.619, 0.643, 0.700, 0.524] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 4 matches and 67 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.313
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(20)449_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 449), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)449_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 548
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 548 weight vectors
  Containing 226 true matches and 322 true non-matches
    (41.24% true matches)
  Identified 509 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   490  (96.27%)
          2 :    16  (3.14%)
          3 :     2  (0.39%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 509 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 319

Removed 1 non-pure weight vector

Final number of weight vectors to use: 547
  Number of unique weight vectors: 509

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (509, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 509 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 509 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 33 matches and 48 non-matches
    Purity of oracle classification:  0.593
    Entropy of oracle classification: 0.975
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 428 weight vectors
  Based on 33 matches and 48 non-matches
  Classified 152 matches and 276 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.5925925925925926, 0.975119064940866, 0.4074074074074074)
    (276, 0.5925925925925926, 0.975119064940866, 0.4074074074074074)

Current size of match and non-match training data sets: 33 / 48

Selected cluster with (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 152 weight vectors
- Estimated match proportion 0.407

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 152 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 53 matches and 5 non-matches
    Purity of oracle classification:  0.914
    Entropy of oracle classification: 0.424
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0
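
The purity and entropy reported above follow directly from the oracle's match / non-match counts. A minimal sketch, assuming purity is the majority-class fraction and entropy is the binary Shannon entropy of the class split (function name is illustrative, not from the program):

```python
from math import log2

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a classified sample."""
    total = num_matches + num_non_matches
    p = num_matches / total          # fraction of matches
    purity = max(p, 1.0 - p)         # majority-class fraction
    entropy = 0.0                    # 0.0 for a pure sample
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * log2(q)
    return purity, entropy

# 53 matches and 5 non-matches, as in the oracle step above
purity, entropy = purity_entropy(53, 5)
print(round(purity, 3), round(entropy, 3))  # 0.914 0.424
```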

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)598_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 598), dtype: object
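
The precision, recall and f-measure in this summary are related to the raw counts in the usual way (tp=39, fp=0, fn=260 here). A sketch with a hypothetical helper, not code from the program:

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from raw true/false positive and negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # harmonic mean of precision and recall
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

# tp=39, fp=0, fn=260 as in the series above
p, r, f = prf(39, 0, 260)
print(round(p, 6), round(r, 6), round(f, 6))  # 1.0 0.130435 0.230769
```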

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)598_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)
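
An occurrence distribution like the one above can be computed with a `Counter` over the weight-vector tuples (a sketch; the function name and data layout are assumptions):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of distinct vectors occurring that often."""
    per_vector = Counter(tuple(wv) for wv in weight_vectors)
    return Counter(per_vector.values())

# one vector twice, one once, one three times
vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.9), (0.9, 0.9), (0.9, 0.9)]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```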

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector
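
Removing non-pure vectors amounts to dropping the minority-class copies of any weight vector that occurs with both labels, keeping the majority label. A sketch, assuming the input is a list of (vector, is_match) pairs (names are illustrative):

```python
from collections import Counter, defaultdict

def remove_minority_copies(labelled_vectors):
    """Drop minority-class copies of vectors that occur with both labels."""
    counts = defaultdict(Counter)           # vector -> Counter of labels
    for vec, label in labelled_vectors:
        counts[vec][label] += 1
    kept = []
    for vec, label in labelled_vectors:
        majority = counts[vec].most_common(1)[0][0]
        if label == majority:
            kept.append((vec, label))
    return kept

# one vector seen 19 times as match, once as non-match (pureness 0.95)
data = [((1.0, 0.9), True)] * 19 + [((1.0, 0.9), False)] + [((0.1, 0.2), False)]
print(len(remove_minority_copies(data)))  # 20
```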

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
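
Farthest-first selection greedily picks, at each step, the vector with the greatest minimum distance to those already selected. A minimal sketch using Euclidean distance; the program's actual distance measure and seed choice may differ:

```python
from math import dist  # Euclidean distance, Python 3.8+

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising the minimum
    distance to the vectors selected so far."""
    selected = [vectors[0]]  # assumed seed: first vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

vecs = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.5, 0.5)]
print(farthest_first(vecs, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```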

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches
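
The split step trains an SVM on the oracle-labelled sample and uses it to partition the remaining vectors into a predicted-match and a predicted-non-match cluster. A hedged sketch with scikit-learn; the kernel and parameters are assumptions, as the log does not show them:

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train an SVM on the labelled sample, then split the
    remaining vectors by predicted class."""
    clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, preds) if p]
    non_matches = [v for v, p in zip(rest_vecs, preds) if not p]
    return matches, non_matches

# tiny illustrative split on clearly separable data
train = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]]
labels = [1, 1, 0, 0]
m, n = svm_split(train, labels, [[0.95, 0.85], [0.15, 0.15]])
print(len(m), len(n))  # 1 1
```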

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 153 weight vectors
- Estimated match proportion 0.326
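
The estimated match proportion and purity propagated to the child clusters are consistent with the Loop 1 oracle sample (28 matches of 86 classified). A quick check, inferred from the numbers rather than taken from the program:

```python
matches, total = 28, 86  # from the oracle step in Loop 1
est_match_proportion = matches / total
purity = max(est_match_proportion, 1 - est_match_proportion)
print(round(est_match_proportion, 4), round(purity, 4))  # 0.3256 0.6744
```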

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)2_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 2), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)2_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)716_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.980198
recall                 0.331104
f-measure                 0.495
da                          101
dm                            0
ndm                           0
tp                           99
fp                            2
tn                  4.76529e+07
fn                          200
Name: (10, 1 - acm diverg, 716), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)716_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 186
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 186 weight vectors
  Containing 149 true matches and 37 true non-matches
    (80.11% true matches)
  Identified 174 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   166  (95.40%)
          2 :     5  (2.87%)
          3 :     2  (1.15%)
          4 :     1  (0.57%)

Identified 0 non-pure unique weight vectors (from 174 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 137
     0.000 : 37

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 186
  Number of unique weight vectors: 174

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (174, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 174 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 62

Perform initial selection using "far" method

Farthest first selection of 62 weight vectors from 174 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
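The "farthest first selection" above is the classic greedy farthest-first traversal. A minimal sketch (our own code, not taken from the script; the seed vector and the Euclidean metric are assumptions):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum Euclidean distance to the already-selected set is largest.
    Starts from the first vector (the script's seeding rule is not shown)."""
    selected = [vectors[0]]
    # Distance from each candidate to its nearest selected vector so far
    min_dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected
```

On the 7-dimensional similarity vectors in this log, such a traversal tends to reach the extreme corners of the similarity space early, which is why both near-all-1.0 (match) and very low (non-match) vectors appear near the top of the listing.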

Perform oracle with 100.00% accuracy on 62 weight vectors
  The oracle will correctly classify 62 weight vectors and wrongly classify 0
  Classified 36 matches and 26 non-matches
    Purity of oracle classification:  0.581
    Entropy of oracle classification: 0.981
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0
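The purity and entropy reported for each classification appear to be the majority-class fraction and the binary Shannon entropy of the match/non-match split: for 36 matches and 26 non-matches this gives 36/62 ≈ 0.581 and H ≈ 0.981, matching the log. A sketch (the function name is ours):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Majority-class purity and binary Shannon entropy (in bits)
    of a two-class match / non-match split."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    # Sum -q*log2(q) over both classes, skipping empty classes
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

`purity_entropy(36, 26)` reproduces the 0.581 / 0.981 figures above; a perfectly balanced cluster gives purity 0.5 and entropy 1.0, the values every run starts from.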

Deleted 62 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 112 weight vectors
  Based on 36 matches and 26 non-matches
  Classified 112 matches and 0 non-matches
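When a sampled cluster is still impure or too large, the remaining unlabelled weight vectors are split by a classifier trained on the oracle-labelled sample. A minimal sketch using scikit-learn's SVC (the script's actual SVM parameters are not shown in this log; the linear kernel is our assumption):

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, remaining_vectors):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining weight vectors into predicted matches / non-matches."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(remaining_vectors, predictions) if p == 0]
    return matches, non_matches
```

In the run above this step put all 112 remaining vectors into the match half, so only one non-empty cluster was re-queued.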

101.0
Analyzing file: diverg(10)48_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990566
recall                 0.351171
f-measure              0.518519
da                          106
dm                            0
ndm                           0
tp                          105
fp                            1
tn                  4.76529e+07
fn                          194
Name: (10, 1 - acm diverg, 48), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)48_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 762
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 762 weight vectors
  Containing 161 true matches and 601 true non-matches
    (21.13% true matches)
  Identified 723 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   694  (95.99%)
          2 :    26  (3.60%)
          3 :     2  (0.28%)
         10 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 723 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 142
     0.900 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 580

Removed 1 non-pure weight vector

Final number of weight vectors to use: 761
  Number of unique weight vectors: 723
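The load-and-analyse step groups duplicate weight vectors, builds the occurrence frequency distribution, and computes a per-unique-vector pureness (the fraction of its occurrences that are true matches). A sketch with `collections.Counter` (the function name is ours):

```python
from collections import Counter

def analyse_weight_vectors(vectors, match_flags):
    """Occurrence frequency distribution and per-unique-vector pureness."""
    keys = [tuple(v) for v in vectors]
    occurrences = Counter(keys)
    match_counts = Counter(k for k, m in zip(keys, match_flags) if m)
    # Occurrence count -> number of unique vectors occurring that often
    freq_dist = Counter(occurrences.values())
    pureness = {k: match_counts[k] / n for k, n in occurrences.items()}
    return freq_dist, pureness
```

A vector with pureness strictly between 0 and 1 (like the single 0.900 entry above) was generated by both matching and non-matching record pairs, so its minority-class occurrences are discarded before training-example selection starts.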

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (723, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 723 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 723 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 26 matches and 59 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.888
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 638 weight vectors
  Based on 26 matches and 59 non-matches
  Classified 85 matches and 553 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (85, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)
    (553, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)

Current size of match and non-match training data sets: 26 / 59

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 85 weight vectors
- Estimated match proportion 0.306

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 85 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 1.000, 1.000, 0.971, 0.952, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 41 matches and 1 non-matches
    Purity of oracle classification:  0.976
    Entropy of oracle classification: 0.162
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
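The runs above all follow the same control flow: a queue of clusters is processed until the manual-classification budget runs out; each iteration samples a cluster, asks the oracle, moves the labelled vectors into the training sets, and, if the cluster is still impure or too large, splits the remainder and re-queues the pieces. A hypothetical outline of that loop (all names and signatures are ours, not the script's):

```python
import random

def recursive_selection(initial_cluster, budget, min_purity, max_cluster_size,
                        sample, oracle, split):
    """Queue-driven recursive training-example selection (outline only)."""
    queue = [initial_cluster]
    train_match, train_non_match = [], []
    num_classified = 0
    while queue and num_classified < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random queue ordering
        sampled = sample(cluster)               # e.g. farthest-first selection
        labels = oracle(sampled)                # manual classifications
        num_classified += len(sampled)
        train_match += [v for v, m in zip(sampled, labels) if m]
        train_non_match += [v for v, m in zip(sampled, labels) if not m]
        remaining = [v for v in cluster if v not in sampled]
        p = sum(labels) / len(labels)
        purity = max(p, 1.0 - p)
        if remaining and (purity < min_purity or len(remaining) > max_cluster_size):
            # e.g. an SVM trained on the labelled sample splits the remainder
            queue.extend(split(train_match, train_non_match, remaining))
    return train_match, train_non_match, num_classified
```

In the log above the budget is reached during Loop 2 of each run, so the final impure clusters are left unsplit.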

106.0
Analyzing file: diverg(10)14_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 14), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)14_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 698
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 698 weight vectors
  Containing 198 true matches and 500 true non-matches
    (28.37% true matches)
  Identified 653 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   619  (94.79%)
          2 :    31  (4.75%)
          3 :     2  (0.31%)
         11 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 653 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 479

Removed 1 non-pure weight vector

Final number of weight vectors to use: 697
  Number of unique weight vectors: 653

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (653, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 653 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 653 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 25 matches and 58 non-matches
    Purity of oracle classification:  0.699
    Entropy of oracle classification: 0.883
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 570 weight vectors
  Based on 25 matches and 58 non-matches
  Classified 143 matches and 427 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.6987951807228916, 0.8827586787955115, 0.30120481927710846)
    (427, 0.6987951807228916, 0.8827586787955115, 0.30120481927710846)

Current size of match and non-match training data sets: 25 / 58

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 143 weight vectors
- Estimated match proportion 0.301

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 143 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing file: diverg(20)202_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 202), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)202_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

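The "farthest first" selections above can be produced by a greedy farthest-first traversal: after an initial vector is picked, each subsequent pick is the vector whose distance to its nearest already-selected vector is largest. A minimal sketch, assuming Euclidean distance and a random start (the function name and signature are illustrative; the program's own implementation is not shown in this log):

```python
import numpy as np

def farthest_first(vectors, k, rng=None):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum distance to the already-selected set is largest."""
    rng = np.random.default_rng(rng)
    vectors = np.asarray(vectors, dtype=float)
    first = rng.integers(len(vectors))
    selected = [int(first)]
    # min_dist[i] = distance from vector i to its nearest selected vector
    min_dist = np.linalg.norm(vectors - vectors[first], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        d = np.linalg.norm(vectors - vectors[nxt], axis=1)
        min_dist = np.minimum(min_dist, d)
    return selected
```

For the first cluster above this would be called with k=87 on the 916 weight vectors; the traversal favours vectors in sparse regions of the similarity space, which is why the selected vectors are so diverse.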
Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

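The purity and entropy reported after each oracle call follow the standard binary definitions: purity is the majority-class fraction, entropy is the binary Shannon entropy (in bits) of the match proportion. A sketch consistent with the figures above, where 24 matches and 63 non-matches give purity 0.724 and entropy 0.850:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class.
    Entropy: binary Shannon entropy (bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy
```

A pure cluster (all matches or all non-matches) has purity 1.0 and entropy 0.0, which is the stopping condition the splitting loop is driving towards.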
Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

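The SVM step trains on the oracle-labelled vectors and splits the remaining unlabelled vectors of the cluster into predicted matches and non-matches (here 829 vectors into 123 and 706). A minimal sketch using scikit-learn's SVC as a stand-in; the linear kernel and the function name are assumptions, not the program's confirmed configuration:

```python
import numpy as np
from sklearn import svm

def svm_split(train_match, train_non_match, unlabelled):
    """Train a binary SVM on oracle-labelled weight vectors and split
    the remaining unlabelled vectors into predicted matches/non-matches."""
    X = np.vstack([train_match, train_non_match])
    y = np.array([1] * len(train_match) + [0] * len(train_non_match))
    clf = svm.SVC(kernel='linear')
    clf.fit(X, y)
    pred = clf.predict(np.asarray(unlabelled))
    matches = [v for v, p in zip(unlabelled, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled, pred) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows from 1 to 2 in the next loop.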
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(20)4_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (20, 1 - acm diverg, 4), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)4_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1041
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1041 weight vectors
  Containing 213 true matches and 828 true non-matches
    (20.46% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   954  (96.46%)
          2 :    32  (3.24%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 807

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1040
  Number of unique weight vectors: 989

Time to load and analyse the weight vector file: 0.01 sec

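The load-and-analyse step above groups identical weight vectors, tabulates how often each unique vector occurs, and computes each unique vector's pureness (the fraction of its occurrences that are true matches) so that minority-class copies of non-pure vectors can be removed. A minimal sketch, with hypothetical names:

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(vectors, labels):
    """Group identical weight vectors; return the occurrence-frequency
    distribution and the pureness (match fraction) of each unique vector."""
    groups = defaultdict(list)
    for vec, is_match in zip(vectors, labels):
        groups[tuple(vec)].append(bool(is_match))
    # Occurrence : number of unique vectors that occur that often
    freq = Counter(len(labs) for labs in groups.values())
    pureness = {vec: sum(labs) / len(labs) for vec, labs in groups.items()}
    return freq, pureness
```

A pureness of 1.000 or 0.000 means every occurrence of that vector has the same true match status; anything in between (such as the 0.941 vector above) is non-pure, and its minority-class occurrences are dropped before clustering begins.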
Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 109 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 109 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(10)962_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 962), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)962_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 208
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 208 weight vectors
  Containing 180 true matches and 28 true non-matches
    (86.54% true matches)
  Identified 190 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   178  (93.68%)
          2 :     9  (4.74%)
          3 :     2  (1.05%)
          6 :     1  (0.53%)

Identified 0 non-pure unique weight vectors (from 190 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 162
     0.000 : 28

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 208
  Number of unique weight vectors: 190

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (190, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 190 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 64

Perform initial selection using "far" method

Farthest first selection of 64 weight vectors from 190 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 42 matches and 22 non-matches
    Purity of oracle classification:  0.656
    Entropy of oracle classification: 0.928
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  22
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 126 weight vectors
  Based on 42 matches and 22 non-matches
  Classified 126 matches and 0 non-matches

69.0
Analysing the file: diverg(10)753_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 753), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)753_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 659
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 659 weight vectors
  Containing 213 true matches and 446 true non-matches
    (32.32% true matches)
  Identified 607 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   571  (94.07%)
          2 :    33  (5.44%)
          3 :     2  (0.33%)
         16 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 607 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 425

Removed 1 non-pure weight vector

Final number of weight vectors to use: 658
  Number of unique weight vectors: 607

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (607, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 607 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 607 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
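
The "far" initial selection above is a farthest-first traversal: start from one vector, then repeatedly pick the vector whose distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and an arbitrary (first) start vector; the actual program may use a different metric or seed:

```python
import math

def farthest_first(vectors, k):
    """Select k vectors from `vectors` (tuples of floats) by
    farthest-first traversal: each step adds the vector whose
    distance to its nearest already-selected vector is largest."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        far = max(remaining,
                  key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(far)
        remaining.remove(far)
    return selected
```

This greedily spreads the sample across the weight-vector space, which is why the 83 vectors listed above mix clear matches, clear non-matches, and borderline cases.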

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 29 matches and 54 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.934
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
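
The purity and entropy figures reported for the oracle sample follow the standard binary definitions: purity is the majority-class fraction, entropy the binary Shannon entropy of the class proportions. For 29 matches and 54 non-matches these give 0.651 and 0.934, matching the log:

```python
import math

def purity(num_matches, num_non_matches):
    """Fraction of the sample belonging to the majority class."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    """Binary Shannon entropy (in bits) of the class proportions."""
    total = num_matches + num_non_matches
    h = 0.0
    for n in (num_matches, num_non_matches):
        if n:
            p = n / total
            h -= p * math.log2(p)
    return h
```

These same values (0.6506..., 0.9335...) reappear as the purity and entropy of both child clusters in Loop 2, since the oracle sample is used to estimate the match proportion of the split.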

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 524 weight vectors
  Based on 29 matches and 54 non-matches
  Classified 179 matches and 345 non-matches
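
The split step trains a classifier on the oracle-labelled sample and partitions the remaining unlabelled vectors into a predicted-match and a predicted-non-match child cluster. The program uses an SVM; the sketch below substitutes a nearest-centroid classifier purely to keep the example dependency-free, and the function name is illustrative:

```python
import math

def split_cluster(labelled, unlabelled):
    """labelled: list of (vector, is_match) from the oracle sample.
    unlabelled: list of remaining vectors in the cluster.
    Returns (predicted_matches, predicted_non_matches).
    Nearest-centroid stand-in for the SVM used in the actual run."""
    def centroid(vecs):
        return tuple(sum(c) / len(vecs) for c in zip(*vecs))
    m_cent = centroid([v for v, y in labelled if y])
    n_cent = centroid([v for v, y in labelled if not y])
    matches, non_matches = [], []
    for v in unlabelled:
        if math.dist(v, m_cent) <= math.dist(v, n_cent):
            matches.append(v)
        else:
            non_matches.append(v)
    return matches, non_matches
```

Both child clusters are then pushed back onto the queue (Loop 2 below shows queue length 2), each inheriting the purity, entropy, and estimated match proportion computed from the oracle sample.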

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)
    (345, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)

Current size of match and non-match training data sets: 29 / 54

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 179 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 59

Farthest first selection of 59 weight vectors from 179 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.942, 1.000, 0.156, 0.172, 0.189, 0.148, 0.133] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 59 weight vectors
  The oracle will correctly classify 59 weight vectors and wrongly classify 0
  Classified 45 matches and 14 non-matches
    Purity of oracle classification:  0.763
    Entropy of oracle classification: 0.791
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  14
    Number of false non-matches: 0

Deleted 59 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(15)774_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 774), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)774_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1031
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1031 weight vectors
  Containing 187 true matches and 844 true non-matches
    (18.14% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   958  (96.87%)
          2 :    28  (2.83%)
          3 :     2  (0.20%)
         11 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 165
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 823

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1030
  Number of unique weight vectors: 989

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 308 matches and 594 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (308, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (594, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 308 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 308 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 40 matches and 28 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.977
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(10)371_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 371), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)371_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 275
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 275 weight vectors
  Containing 178 true matches and 97 true non-matches
    (64.73% true matches)
  Identified 254 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   239  (94.09%)
          2 :    12  (4.72%)
          3 :     2  (0.79%)
          6 :     1  (0.39%)

Identified 0 non-pure unique weight vectors (from 254 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 159
     0.000 : 95

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 275
  Number of unique weight vectors: 254

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (254, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 254 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 70

Perform initial selection using "far" method

Farthest first selection of 70 weight vectors from 254 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 36 matches and 34 non-matches
    Purity of oracle classification:  0.514
    Entropy of oracle classification: 0.999
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  34
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 184 weight vectors
  Based on 36 matches and 34 non-matches
  Classified 128 matches and 56 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 70
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (128, 0.5142857142857142, 0.9994110647387553, 0.5142857142857142)
    (56, 0.5142857142857142, 0.9994110647387553, 0.5142857142857142)

Current size of match and non-match training data sets: 36 / 34

Selected cluster (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 128 weight vectors
- Estimated match proportion 0.514

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 128 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
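The farthest-first selection above is a greedy traversal: seed with one vector, then repeatedly add the vector whose distance to the nearest already-selected vector is largest, so the sample spreads over the extremes of the cluster. A dependency-free sketch (the original program may seed the traversal differently):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: pick k well-spread vectors.
    A candidate's distance to the selected set is its distance to
    the nearest already-selected vector."""

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]          # seed with the first vector
    candidates = list(vectors[1:])
    while len(selected) < k and candidates:
        best = max(candidates,
                   key=lambda v: min(sq_dist(v, s) for s in selected))
        selected.append(best)
        candidates.remove(best)
    return selected

# The two extreme corners are picked before the near-duplicates:
sample = farthest_first(
    [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]], 2)
# sample == [[0.0, 0.0], [1.0, 1.0]]
```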

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 51 matches and 4 non-matches
    Purity of oracle classification:  0.927
    Entropy of oracle classification: 0.376
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing the file: diverg(20)864_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 864), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)864_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044
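The pureness filtering above groups identical weight vectors, computes each group's fraction of true matches, and removes the minority-class copies of any non-pure group (here the vector occurring 20 times with 19 matches, pureness 0.950, loses its single non-match copy). A sketch with illustrative names; keeping the match copies on an exact tie is an assumption:

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    """Drop the minority-class copies of any non-pure group of
    identical weight vectors. Pure groups (pureness 0.0 or 1.0)
    are kept whole. Assumption: a tied group keeps its match copies."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, labels):
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labs in groups.items():
        pureness = labs.count(True) / float(len(labs))
        majority_is_match = pureness >= 0.5
        for is_match in labs:
            if is_match == majority_is_match:
                kept.append((list(vec), is_match))
    return kept

# 20 copies of one vector, 19 matches + 1 non-match (pureness 0.95),
# plus one pure non-match vector: the single minority copy is removed.
vecs = [[0.5, 0.5]] * 20 + [[0.1, 0.1]]
labels = [True] * 19 + [False, False]
kept = remove_minority_copies(vecs, labels)
```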

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 156 matches and 800 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (800, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 156 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 156 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)758_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 758), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)758_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)201_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (10, 1 - acm diverg, 201), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)201_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 681
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 681 weight vectors
  Containing 187 true matches and 494 true non-matches
    (27.46% true matches)
  Identified 649 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   633  (97.53%)
          2 :    13  (2.00%)
          3 :     2  (0.31%)
         16 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 649 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 157
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 491

Removed 1 non-pure weight vector

Final number of weight vectors to use: 680
  Number of unique weight vectors: 649
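
The non-pure vector handling reported above (one minority-class copy removed from a unique vector that occurs with both labels) can be sketched as follows; `remove_minority_copies` is a hypothetical name, and ties between labels fall to `Counter`'s arbitrary ordering:

```python
from collections import Counter, defaultdict

def remove_minority_copies(labelled_vectors):
    """For every unique weight vector that occurs with both match and
    non-match labels, keep only the copies carrying the majority label."""
    label_counts = defaultdict(Counter)
    for vec, is_match in labelled_vectors:
        label_counts[vec][is_match] += 1
    kept = []
    for vec, is_match in labelled_vectors:
        majority_label = label_counts[vec].most_common(1)[0][0]
        if is_match == majority_label:
            kept.append((vec, is_match))
    return kept

# One vector occurs 16 times: 15 matches, 1 non-match (pureness 0.938);
# the single minority-class copy is removed
vecs = [((0.5, 0.5), True)] * 15 + [((0.5, 0.5), False)]
print(len(remove_minority_copies(vecs)))  # 15
```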

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (649, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 649 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 649 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
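
The farthest-first selection listed above can be sketched as a greedy traversal (a minimal reimplementation assuming Euclidean distance; the original program's distance measure and starting vector may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors: start from the first vector, then
    repeatedly add the vector farthest from all selected so far."""
    selected = [vectors[0]]
    # Distance from each vector to its nearest already-selected vector
    min_dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        far_idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[far_idx])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[far_idx]))
    return selected

vectors = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 0.8)]
print(farthest_first(vectors, 2))  # [(0.0, 0.0), (1.0, 1.0)]
```

This spreads the sample over the corners of the weight-vector space, which is why both very high (match-like) and very low (non-match-like) vectors appear in the listing.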

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 35 matches and 48 non-matches
    Purity of oracle classification:  0.578
    Entropy of oracle classification: 0.982
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0
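
The purity and entropy figures reported for each oracle step are the majority-class fraction and the binary label entropy of the classified sample; a minimal sketch (hypothetical helper name) reproducing the numbers above:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary label entropy of a sample."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    if p in (0.0, 1.0):
        return purity, 0.0
    entropy = -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)
    return purity, entropy

# 35 matches / 48 non-matches, as classified above
purity, entropy = purity_entropy(35, 48)
print(round(purity, 3), round(entropy, 3))  # 0.578 0.982
```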

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 566 weight vectors
  Based on 35 matches and 48 non-matches
  Classified 259 matches and 307 non-matches
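
The SVM split step can be sketched with scikit-learn (an assumed library choice — the log does not name the SVM implementation; `svm_split` is a hypothetical helper): the oracle-labelled sample becomes training data, and the unlabelled remainder of the cluster is partitioned by the predicted class:

```python
from sklearn.svm import SVC

def svm_split(match_train, non_match_train, remaining):
    """Train a linear SVM on oracle-labelled weight vectors, then split
    the remaining cluster by the predicted match status."""
    X = match_train + non_match_train
    y = [1] * len(match_train) + [0] * len(non_match_train)
    clf = SVC(kernel="linear").fit(X, y)
    predictions = clf.predict(remaining)
    matches = [v for v, p in zip(remaining, predictions) if p == 1]
    non_matches = [v for v, p in zip(remaining, predictions) if p == 0]
    return matches, non_matches
```

Each predicted part is then pushed back on the cluster queue, inheriting the purity, entropy, and match-proportion estimates of the oracle sample, as seen in the Loop 2 queue listing that follows.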

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (259, 0.5783132530120482, 0.9822309298084992, 0.42168674698795183)
    (307, 0.5783132530120482, 0.9822309298084992, 0.42168674698795183)

Current size of match and non-match training data sets: 35 / 48

Selected cluster (queue ordering: random) with:
- Purity 0.58 and entropy 0.98
- Size 307 weight vectors
- Estimated match proportion 0.422

Sample size for this cluster: 72
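
The per-cluster sample sizes in this run are consistent with Cochran's sample-size formula with a finite-population correction, using z = 1.96, a 0.1 margin of error, and the cluster's estimated match proportion as p — a hedged reconstruction from the logged numbers, not a formula confirmed by the source:

```python
def cochran_sample_size(population, match_prop, z=1.96, error=0.1):
    """Cochran's formula n0 = z^2 * p * (1 - p) / e^2, corrected for a
    finite population of `population` weight vectors."""
    n0 = z ** 2 * match_prop * (1.0 - match_prop) / error ** 2
    return round(n0 / (1.0 + (n0 - 1.0) / population))

# 307 vectors with estimated match proportion 0.422, as in this loop
print(cochran_sample_size(307, 0.42168674698795183))  # 72
```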

Farthest first selection of 72 weight vectors from 307 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.462, 0.667, 0.600, 0.389, 0.615] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.680, 0.000, 0.609, 0.737, 0.600, 0.529, 0.696] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 0 matches and 72 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing file: diverg(15)735_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 735), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)735_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 528
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 528 weight vectors
  Containing 224 true matches and 304 true non-matches
    (42.42% true matches)
  Identified 489 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   470  (96.11%)
          2 :    16  (3.27%)
          3 :     2  (0.41%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 489 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 301

Removed 1 non-pure weight vector

Final number of weight vectors to use: 527
  Number of unique weight vectors: 489

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (489, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 489 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 489 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 36 matches and 44 non-matches
    Purity of oracle classification:  0.550
    Entropy of oracle classification: 0.993
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 409 weight vectors
  Based on 36 matches and 44 non-matches
  Classified 208 matches and 201 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (208, 0.55, 0.9927744539878084, 0.45)
    (201, 0.55, 0.9927744539878084, 0.45)

Current size of match and non-match training data sets: 36 / 44

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 201 weight vectors
- Estimated match proportion 0.450

Sample size for this cluster: 65

Farthest first selection of 65 weight vectors from 201 vectors
  The selected farthest weight vectors are:
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.818, 0.727, 0.438, 0.375, 0.400] (False)
    [1.000, 0.000, 0.800, 0.636, 0.563, 0.545, 0.722] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 65 weight vectors
  The oracle will correctly classify 65 weight vectors and wrongly classify 0
  Classified 1 match and 64 non-matches
    Purity of oracle classification:  0.985
    Entropy of oracle classification: 0.115
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 65 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)739_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 739), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)739_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1092
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1092 weight vectors
  Containing 226 true matches and 866 true non-matches
    (20.70% true matches)
  Identified 1035 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   998  (96.43%)
          2 :    34  (3.29%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1035 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 845

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1091
  Number of unique weight vectors: 1035

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1035, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1035 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1035 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 27 matches and 61 non-matches
    Purity of oracle classification:  0.693
    Entropy of oracle classification: 0.889
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
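The purity and entropy figures reported for each oracle-classified sample are the majority-class fraction and the two-class Shannon entropy of the match/non-match split. A minimal sketch (the function name `purity_and_entropy` is illustrative, not from the original script):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity is the fraction of the majority class; entropy is the
    two-class Shannon entropy of the match/non-match proportions."""
    total = num_matches + num_non_matches
    p_match = num_matches / total
    purity = max(p_match, 1.0 - p_match)
    entropy = 0.0
    for p in (p_match, 1.0 - p_match):
        if p > 0.0:
            entropy -= p * math.log(p, 2)
    return purity, entropy

# Reproduce the figures for the 88-vector sample above
# (27 matches, 61 non-matches)
purity, entropy = purity_and_entropy(27, 61)
print(round(purity, 3), round(entropy, 3))  # → 0.693 0.889
```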

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 947 weight vectors
  Based on 27 matches and 61 non-matches
  Classified 148 matches and 799 non-matches
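After the oracle-labelled sample is removed, the remaining weight vectors in the cluster are split by a classifier trained on that sample. A hedged sketch using scikit-learn's `SVC` (the original script's SVM kernel and parameters are not shown in this output, so a linear kernel is assumed; `svm_split` is an illustrative name):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, rest_vectors):
    """Train an SVM on the oracle-classified sample, then split the
    remaining weight vectors into predicted matches / non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(np.asarray(train_vectors), np.asarray(train_labels))
    pred = clf.predict(np.asarray(rest_vectors))
    matches = [v for v, p in zip(rest_vectors, pred) if p == 1]
    non_matches = [v for v, p in zip(rest_vectors, pred) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is what produces the two queue entries reported in the next loop.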

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6931818181818182, 0.8894663896628687, 0.3068181818181818)
    (799, 0.6931818181818182, 0.8894663896628687, 0.3068181818181818)

Current size of match and non-match training data sets: 27 / 61

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 148 weight vectors
- Estimated match proportion 0.307

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
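Farthest first selection, as used above, greedily picks at each step the weight vector whose distance to its nearest already-selected vector is largest. A minimal sketch assuming Euclidean distance and the first vector as seed (both assumptions; the original script may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector and
    repeatedly add the vector whose nearest selected vector is farthest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    # Distance from each vector to its nearest selected vector so far
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected
```

Selected vectors have `min_dist` zero, so they are never picked twice; the spread-out samples this produces are what make the oracle labels informative for the SVM split.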

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)408_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 408), dtype: object
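The precision, recall, and f-measure in these per-file summary rows follow directly from the tp/fp/fn counts. For the row above (tp = 40, fp = 0, fn = 259), a quick check (`prf` is an illustrative helper, not from the original code):

```python
def prf(tp, fp, fn):
    """Standard precision / recall / F1 from true positive, false
    positive, and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

p, r, f = prf(40, 0, 259)
print(round(p, 6), round(r, 6), round(f, 6))  # → 1.0 0.133779 0.235988
```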

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)408_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 701
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 701 weight vectors
  Containing 219 true matches and 482 true non-matches
    (31.24% true matches)
  Identified 646 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   610  (94.43%)
          2 :    33  (5.11%)
          3 :     2  (0.31%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 646 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 461

Removed 1 non-pure weight vector

Final number of weight vectors to use: 700
  Number of unique weight vectors: 646
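The duplicate and pureness analysis above amounts to grouping identical weight vectors and measuring, for each unique vector, how often it occurs and what fraction of its copies are true matches. A sketch of that bookkeeping (`analyse_vectors` is an illustrative name; keying groups by tuple is an assumption):

```python
from collections import Counter

def analyse_vectors(weight_vectors, labels):
    """Group identical weight vectors, then report how often each unique
    vector occurs and what fraction of its copies are true matches."""
    groups = {}
    for vec, is_match in zip(weight_vectors, labels):
        groups.setdefault(tuple(vec), []).append(is_match)

    # Occurrence : number of unique weight vectors occurring that often
    freq_dist = Counter(len(v) for v in groups.values())
    # Pureness of each unique vector: proportion of copies that are matches
    pureness = {vec: sum(v) / len(v) for vec, v in groups.items()}
    return freq_dist, pureness
```

A unique vector with pureness strictly between 0 and 1 is "non-pure"; the log above shows its minority-class copies being removed before training-example selection starts.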

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (646, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 646 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 646 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 563 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 157 matches and 406 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (157, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (406, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 157 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 157 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 51 matches and 5 non-matches
    Purity of oracle classification:  0.911
    Entropy of oracle classification: 0.434
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(10)278_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (10, 1 - acm diverg, 278), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)278_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 730
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 730 weight vectors
  Containing 220 true matches and 510 true non-matches
    (30.14% true matches)
  Identified 694 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   678  (97.69%)
          2 :    13  (1.87%)
          3 :     2  (0.29%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 694 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 509

Removed 1 non-pure weight vector

Final number of weight vectors to use: 729
  Number of unique weight vectors: 694

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (694, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 694 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 694 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 610 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 143 matches and 467 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (467, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.92
- Size 143 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 143 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 50 matches and 4 non-matches
    Purity of oracle classification:  0.926
    Entropy of oracle classification: 0.381
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0
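The purity and entropy figures the oracle reports can be reproduced from the match/non-match counts alone: purity is the majority-class fraction and entropy is the binary Shannon entropy of the class split. A minimal sketch (the function name is ours, not from the original script):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity (majority fraction) and binary entropy of a class split."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# 50 matches / 4 non-matches, as in the oracle output above
p, e = purity_entropy(50, 4)
print(round(p, 3), round(e, 3))  # 0.926 0.381
```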

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)707_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 707), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)707_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 797
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 797 weight vectors
  Containing 220 true matches and 577 true non-matches
    (27.60% true matches)
  Identified 759 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   741  (97.63%)
          2 :    15  (1.98%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
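The occurrence histogram above (how many weight vectors appear once, twice, and so on) is a straightforward two-level count; a sketch of one way to compute it (the function name is ours):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each weight vector appears, then tally those counts."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```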

Identified 1 non-pure unique weight vector (from 759 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 574

Removed 1 non-pure weight vector

Final number of weight vectors to use: 796
  Number of unique weight vectors: 759

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (759, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 759 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 759 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
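Farthest-first selection, as used above, is the greedy k-center traversal: start from one vector, then repeatedly pick the vector whose minimum distance to the already-selected set is largest. A sketch assuming Euclidean distance (the original script's distance measure is not shown here):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal over a list of numeric tuples."""
    selected = [vectors[start]]
    # dist[j] = distance from vectors[j] to its nearest selected vector
    dist = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            dist[j] = min(dist[j], math.dist(v, vectors[i]))
    return selected
```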

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 674 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 143 matches and 531 non-matches
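The SVM split step trains on the oracle-labelled sample (here 30 matches, 55 non-matches) and partitions the cluster's remaining unlabelled vectors into predicted matches and non-matches, which then re-enter the queue as two clusters. A sketch assuming scikit-learn's `SVC` (the original script's SVM backend and kernel are not shown in this log):

```python
from sklearn.svm import SVC

def svm_split(labelled, labels, unlabelled):
    """Fit an SVM on oracle-labelled vectors, split the rest by prediction."""
    clf = SVC(kernel='linear')
    clf.fit(labelled, labels)          # labels: 1 = match, 0 = non-match
    pred = clf.predict(unlabelled)
    matches = [v for v, p in zip(unlabelled, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled, pred) if p == 0]
    return matches, non_matches
```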

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (531, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 531 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 531 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.273, 0.667, 0.643, 0.700, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.826, 0.429, 0.538, 0.636] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 6 matches and 69 non-matches
    Purity of oracle classification:  0.920
    Entropy of oracle classification: 0.402
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(15)441_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (15, 1 - acm diverg, 441), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)441_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 714
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 714 weight vectors
  Containing 201 true matches and 513 true non-matches
    (28.15% true matches)
  Identified 682 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   666  (97.65%)
          2 :    13  (1.91%)
          3 :     2  (0.29%)
         16 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 682 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 171
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 510

Removed 1 non-pure weight vector

Final number of weight vectors to use: 713
  Number of unique weight vectors: 682

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (682, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 682 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 682 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 33 matches and 51 non-matches
    Purity of oracle classification:  0.607
    Entropy of oracle classification: 0.967
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 598 weight vectors
  Based on 33 matches and 51 non-matches
  Classified 153 matches and 445 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6071428571428571, 0.9666186325481028, 0.39285714285714285)
    (445, 0.6071428571428571, 0.9666186325481028, 0.39285714285714285)

Current size of match and non-match training data sets: 33 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.97
- Size 445 weight vectors
- Estimated match proportion 0.393

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 445 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.667, 0.000, 0.800, 0.684, 0.667, 0.529, 0.609] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 2 matches and 74 non-matches
    Purity of oracle classification:  0.974
    Entropy of oracle classification: 0.176
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing file: diverg(15)567_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 567), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)567_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 548
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 548 weight vectors
  Containing 226 true matches and 322 true non-matches
    (41.24% true matches)
  Identified 509 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   490  (96.27%)
          2 :    16  (3.14%)
          3 :     2  (0.39%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 509 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 319

Removed 1 non-pure weight vector

Final number of weight vectors to use: 547
  Number of unique weight vectors: 509

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (509, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 509 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 509 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

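The "far" initial selection shown above is a farthest-first traversal: starting from one vector, it repeatedly adds the vector whose distance to the closest already-selected vector is largest. A minimal sketch, assuming Euclidean distance and an arbitrary start index (both assumptions, not necessarily the script's choices):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    distance to the closest already-selected vector is largest."""
    selected = [start]
    # Minimum distance from each vector to the selected set so far.
    dists = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=dists.__getitem__)
        selected.append(nxt)
        for i, v in enumerate(vectors):     # tighten the min-distances
            dists[i] = min(dists[i], math.dist(v, vectors[nxt]))
    return selected

# Corners of a unit square plus its centre; after the start point the
# opposite corner is picked first, since it is the farthest away.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
picked = farthest_first(pts, 3)             # picked[:2] == [0, 3]
```

This greedy rule is why the selected vectors above spread across extreme weight combinations (all-1.0 rows, all-0.0 columns) rather than clustering.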
Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 33 matches and 48 non-matches
    Purity of oracle classification:  0.593
    Entropy of oracle classification: 0.975
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

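The purity and entropy figures the oracle reports can be reproduced from the match/non-match counts alone: purity is the majority-class fraction of the classified sample, and entropy is the binary Shannon entropy of the split. A sketch, checked against the 33/48 split above:

```python
import math

def purity_entropy(n_match, n_nonmatch):
    """Purity is the majority-class fraction of the classified sample;
    entropy is the binary Shannon entropy (in bits) of the split."""
    p = n_match / (n_match + n_nonmatch)
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# The 33 matches / 48 non-matches of the first oracle round give
# purity 0.593 and entropy 0.975, matching the log above.
purity, entropy = purity_entropy(33, 48)
```

A perfectly balanced cluster therefore has purity 0.5 and entropy 1.0 (the initial queue entry), while a pure cluster has purity 1.0 and entropy 0.0.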
Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 428 weight vectors
  Based on 33 matches and 48 non-matches
  Classified 152 matches and 276 non-matches

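When a cluster is neither pure enough nor small enough, the script splits its remaining unlabelled vectors using a classifier (here an SVM) trained on the oracle-labelled sample, producing one predicted-match and one predicted-non-match cluster. As a dependency-free sketch, a nearest-centroid classifier stands in for the SVM below — a deliberate simplification, not the script's actual method:

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def split_cluster(rest, train_vecs, train_labels):
    """Split unlabelled vectors into two clusters by which class
    centroid of the labelled training sample they are closer to."""
    c_match = centroid([v for v, l in zip(train_vecs, train_labels) if l])
    c_non = centroid([v for v, l in zip(train_vecs, train_labels) if not l])

    def is_match(v):
        d_m = sum((a - b) ** 2 for a, b in zip(v, c_match))
        d_n = sum((a - b) ** 2 for a, b in zip(v, c_non))
        return d_m < d_n

    matches = [v for v in rest if is_match(v)]
    non_matches = [v for v in rest if not is_match(v)]
    return matches, non_matches

# Toy labelled sample (two matches, two non-matches) and three
# unlabelled vectors to split.
train = [(0.9, 0.9), (0.8, 1.0), (0.1, 0.2), (0.2, 0.1)]
labels = [True, True, False, False]
rest = [(0.95, 0.85), (0.05, 0.15), (0.5, 0.5)]
matches, non_matches = split_cluster(rest, train, labels)
```

Both resulting clusters are pushed back onto the queue, which is why Loop 2 above shows a queue of length 2 (sizes 152 and 276).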
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.5925925925925926, 0.975119064940866, 0.4074074074074074)
    (276, 0.5925925925925926, 0.975119064940866, 0.4074074074074074)

Current size of match and non-match training data sets: 33 / 48

Selected cluster (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 276 weight vectors
- Estimated match proportion 0.407

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 276 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.800, 0.636, 0.563, 0.545, 0.722] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 5 matches and 64 non-matches
    Purity of oracle classification:  0.928
    Entropy of oracle classification: 0.375
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)134_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 134), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)134_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1100
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1100 weight vectors
  Containing 227 true matches and 873 true non-matches
    (20.64% true matches)
  Identified 1043 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1006  (96.45%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1043 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1099
  Number of unique weight vectors: 1043

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1043, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1043 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1043 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 955 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 846 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)131_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.982759
recall                 0.190635
f-measure              0.319328
da                           58
dm                            0
ndm                           0
tp                           57
fp                            1
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 131), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)131_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 694
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 694 weight vectors
  Containing 200 true matches and 494 true non-matches
    (28.82% true matches)
  Identified 644 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   610  (94.72%)
          2 :    31  (4.81%)
          3 :     2  (0.31%)
         16 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 644 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 170
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 473

Removed 1 non-pure weight vector

Final number of weight vectors to use: 693
  Number of unique weight vectors: 644

Time to load and analyse the weight vector file: 0.04 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (644, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 644 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 644 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 561 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 151 matches and 410 non-matches
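The SVM splitting step logged above (train on the oracle-labelled vectors, classify the remaining cluster members, and queue the two predicted groups as new clusters) might look like the following with scikit-learn. This is a hedged sketch: the SVM implementation, kernel, and parameters actually used by the script are not visible in the log, and the arrays here are random stand-ins with the logged shapes.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)
# Stand-in data with the sizes from the log: 83 oracle-labelled
# training vectors (28 matches, 55 non-matches) and 561 unlabelled
# weight vectors left in the cluster, each with 7 similarity weights.
train_X = rng.random((83, 7))
train_y = np.array([1] * 28 + [0] * 55)
cluster_X = rng.random((561, 7))

# Train on the oracle labels, then split the rest of the cluster
# into a predicted-match and a predicted-non-match sub-cluster.
clf = SVC(kernel="linear").fit(train_X, train_y)
pred = clf.predict(cluster_X)
match_cluster = cluster_X[pred == 1]
non_match_cluster = cluster_X[pred == 0]
print(len(match_cluster), len(non_match_cluster))
```

Both sub-clusters are then pushed back onto the queue for further sampling in the next loop.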

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (410, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 410 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 410 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [1.000, 0.000, 0.700, 0.429, 0.476, 0.647, 0.810] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
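The "farthest first" selections above can be sketched as a greedy max-min traversal: start from one vector, then repeatedly pick the vector whose minimum Euclidean distance to the already-selected set is largest. This is an assumption-laden sketch; the script's seed choice, distance metric, and tie-breaking are not shown in the log.

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors
    (tuples of floats): seed with the first vector, then add the
    vector maximising its minimum distance to the selected set."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Distance of a candidate to its nearest selected vector.
        def min_dist(v):
            return min(math.dist(v, s) for s in selected)
        far = max(remaining, key=min_dist)
        remaining.remove(far)
        selected.append(far)
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.9, 0.1)], 2)
print(sample)  # [(0.0, 0.0), (1.0, 1.0)]
```

This spreads the sample across the cluster's extremes, which is why the selected vectors above mix clearly match-like and clearly non-match-like patterns.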

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 2 matches and 69 non-matches
    Purity of oracle classification:  0.972
    Entropy of oracle classification: 0.185
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(20)196_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 196), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)196_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 845
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 845 weight vectors
  Containing 227 true matches and 618 true non-matches
    (26.86% true matches)
  Identified 788 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   751  (95.30%)
          2 :    34  (4.31%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)
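A frequency distribution like the one printed above (how many distinct weight vectors occur once, twice, etc.) can be built with two nested counts; a small sketch, not the script's own code:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then how
    many distinct vectors share each occurrence count."""
    per_vector = Counter(map(tuple, weight_vectors))
    return Counter(per_vector.values())

dist = occurrence_distribution(
    [[0.1, 0.2]] * 3 + [[0.5, 0.5]] * 2 + [[1.0, 0.0]]
)
print(sorted(dist.items()))  # [(1, 1), (2, 1), (3, 1)]
```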

Identified 1 non-pure unique weight vector (from 788 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 597

Removed 1 non-pure weight vector
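The removal step above drops, for each non-pure unique weight vector, the copies carrying the minority class label, so that every remaining unique vector has a single true match status. A hedged sketch of that clean-up (the function and data here are illustrative, not from the script):

```python
from collections import Counter, defaultdict

def remove_minority_copies(weight_vectors):
    """Given (vector, is_match) pairs, drop the minority-class copies
    of every unique vector so each remaining vector is pure."""
    by_vec = defaultdict(list)
    for vec, is_match in weight_vectors:
        by_vec[vec].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        majority = Counter(labels).most_common(1)[0][0]
        kept.extend((vec, lab) for lab in labels if lab == majority)
    return kept

# One vector occurs 19 times as a match and once as a non-match
# (pureness 0.950); the single non-match copy is removed.
data = [((0.9, 0.8), True)] * 19 + [((0.9, 0.8), False)] + [((0.1, 0.2), False)]
print(len(remove_minority_copies(data)))  # 20
```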

Final number of weight vectors to use: 844
  Number of unique weight vectors: 788

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (788, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 788 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 788 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 703 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 162 matches and 541 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (162, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (541, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 162 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 162 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)996_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 996), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)996_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 665
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 665 weight vectors
  Containing 204 true matches and 461 true non-matches
    (30.68% true matches)
  Identified 616 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   582  (94.48%)
          2 :    31  (5.03%)
          3 :     2  (0.32%)
         15 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 616 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 175
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 440

Removed 1 non-pure weight vector

Final number of weight vectors to use: 664
  Number of unique weight vectors: 616

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (616, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 616 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 616 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 533 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 125 matches and 408 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (125, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (408, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 408 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 408 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 14 matches and 57 non-matches
    Purity of oracle classification:  0.803
    Entropy of oracle classification: 0.716
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
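The purity and entropy printed after each oracle call follow the usual two-class definitions: purity is the fraction of the sample in the majority class, and entropy is the Shannon entropy of the match/non-match proportions. A minimal sketch (independent of the original script) that reproduces the figures above:

```python
from math import log2

def purity_entropy(num_match, num_non_match):
    """Purity (majority-class fraction) and Shannon entropy (in bits)
    of a sample split into matches and non-matches."""
    total = num_match + num_non_match
    p = num_match / total                 # proportion of matches
    purity = max(p, 1.0 - p)              # fraction in the majority class
    entropy = -sum(q * log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# 14 matches and 57 non-matches, as in the oracle output above
purity, entropy = purity_entropy(14, 57)
print('%.3f %.3f' % (purity, entropy))    # 0.803 0.716
```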

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(20)692_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 692), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)692_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
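The occurrence distribution above can be computed with two `collections.Counter` passes: one counting how often each distinct weight vector occurs, and one counting how many distinct vectors share each occurrence count. A sketch with a hypothetical `occurrence_distribution` helper:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that occur that often (the frequency table above)."""
    vec_counts = Counter(map(tuple, weight_vectors))  # vector -> occurrences
    freq_dist = Counter(vec_counts.values())          # occurrences -> #vectors
    num_unique = len(vec_counts)
    for occ in sorted(freq_dist):
        num = freq_dist[occ]
        print('%11d : %5d  (%.2f%%)' % (occ, num, 100.0 * num / num_unique))
    return freq_dist

# Three copies of one vector and one copy each of two others
occurrence_distribution([[0.1, 0.2]] * 3 + [[0.3, 0.4], [0.5, 0.6]])
```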

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as a percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
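A non-pure unique weight vector is one whose copies carry both match and non-match labels; the minority-class copies are removed before training. A sketch of this filtering step, assuming a simple majority rule (the original script's tie-breaking may differ):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, match_flags):
    """Group identical weight vectors, compute the pureness of each
    group (fraction of matches among its copies), and drop the
    minority-class copies of any non-pure group."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, match_flags):
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, flags in groups.items():
        pureness = sum(flags) / len(flags)
        if pureness in (0.0, 1.0):        # pure group: keep all copies
            kept.extend((vec, f) for f in flags)
        else:                             # non-pure: keep majority copies only
            majority = pureness >= 0.5    # assumed tie-break toward matches
            kept.extend((vec, f) for f in flags if f == majority)
    return kept
```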

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
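Farthest-first selection greedily picks, at each step, the weight vector whose distance to the closest already-selected vector is largest, which spreads the sample across the cluster. A sketch using Euclidean distance and a seeded random start (both assumptions; the original program's choices may differ):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Return the indices of k vectors chosen by farthest-first
    traversal: start at a random vector, then repeatedly add the
    vector farthest from the already-selected set."""
    X = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(X)))]           # assumed random start
    min_dist = np.linalg.norm(X - X[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))               # farthest from selected set
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected
```

On three distinct points with k=3 this visits all of them; the first two picks are always the start vector and the point farthest from it.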

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
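The split classifier trains on the oracle-labelled sample and divides the cluster's remaining weight vectors into predicted matches and non-matches. A sketch using scikit-learn's `SVC`; the kernel and parameters here are assumptions, not necessarily what the original script uses:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, unlabelled_vecs):
    """Fit an SVM on the labelled sample, then split the unlabelled
    weight vectors by their predicted class."""
    clf = SVC(kernel='linear')                 # assumed kernel
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, pred) if p]
    non_matches = [v for v, p in zip(unlabelled_vecs, pred) if not p]
    return matches, non_matches
```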

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)200_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 200), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)200_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 732
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 732 weight vectors
  Containing 219 true matches and 513 true non-matches
    (29.92% true matches)
  Identified 677 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   641  (94.68%)
          2 :    33  (4.87%)
          3 :     2  (0.30%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 677 unique weight vectors)
Pureness (as a percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 492

Removed 1 non-pure weight vector

Final number of weight vectors to use: 731
  Number of unique weight vectors: 677

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (677, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 677 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 677 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 593 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 148 matches and 445 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (445, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 148 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 51 matches and 3 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.310
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)859_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 859), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)859_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as a percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 950 non-matches

46.0
Analysing the file: diverg(20)457_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 457), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)457_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1035
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1035 weight vectors
  Containing 223 true matches and 812 true non-matches
    (21.55% true matches)
  Identified 981 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   944  (96.23%)
          2 :    34  (3.47%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 981 unique weight vectors)
Pureness (proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 791

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1034
  Number of unique weight vectors: 981

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (981, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 981 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 981 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
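
The "farthest first" selection above can be sketched with the standard greedy farthest-first traversal: seed with one vector, then repeatedly add the vector whose distance to the nearest already-selected vector is largest. This is a minimal sketch assuming Euclidean distance between weight vectors and a random starting vector; the program's actual distance function and seeding may differ.

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first traversal over a list of weight vectors."""
    rng = random.Random(seed)
    selected = [rng.choice(vectors)]
    # Distance from each vector to its nearest selected vector so far
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        # Pick the vector farthest from everything selected so far
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            d = math.dist(v, vectors[i])
            if d < min_dist[j]:
                min_dist[j] = d
    return selected
```

Because each new pick maximises the minimum distance to the current selection, the sample spreads across the weight-vector space instead of clustering, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.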

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 894 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 156 matches and 738 non-matches
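
The SVM split step shown in the log can be sketched as follows. This assumes scikit-learn's `SVC` with a linear kernel, which is an assumption on my part — the log does not say which SVM implementation or kernel the program uses. The classifier is fitted on the oracle-labelled vectors, then the remaining cluster is partitioned by predicted class.

```python
# Sketch of the cluster-splitting step: fit an SVM on the
# oracle-labelled training vectors, then partition the remaining
# cluster by predicted class.  Kernel choice is an assumption.
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are what appear in the next loop's queue (here of sizes 156 and 738), each inheriting the parent cluster's purity and entropy estimates until they are sampled themselves.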

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (738, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 738 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 738 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 5 matches and 70 non-matches
    Purity of oracle classification:  0.933
    Entropy of oracle classification: 0.353
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(15)229_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981481
recall                 0.177258
f-measure              0.300283
da                           54
dm                            0
ndm                           0
tp                           53
fp                            1
tn                  4.76529e+07
fn                          246
Name: (15, 1 - acm diverg, 229), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)229_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1067
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1067 weight vectors
  Containing 213 true matches and 854 true non-matches
    (19.96% true matches)
  Identified 1013 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   978  (96.54%)
          2 :    32  (3.16%)
          3 :     2  (0.20%)
         19 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1013 unique weight vectors)
Pureness (proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 179
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 833

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1066
  Number of unique weight vectors: 1013

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1013, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1013 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1013 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 926 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 163 matches and 763 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (163, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (763, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 163 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 163 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 42 matches and 15 non-matches
    Purity of oracle classification:  0.737
    Entropy of oracle classification: 0.831
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  15
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

54.0
Analysing the file: diverg(20)685_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 685), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)685_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 961
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 961 weight vectors
  Containing 217 true matches and 744 true non-matches
    (22.58% true matches)
  Identified 906 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   870  (96.03%)
          2 :    33  (3.64%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 906 unique weight vectors)
Pureness (proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 723

Removed 1 non-pure weight vector

Final number of weight vectors to use: 960
  Number of unique weight vectors: 906

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (906, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 906 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using the "far" method

Farthest first selection of 87 weight vectors from 906 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
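
The "far" initial selection above is a farthest-first traversal: each new vector is the one whose minimum distance to the already selected set is largest. A greedy sketch, assuming Euclidean distance and seeding from the first vector (both are assumptions; the script may choose differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum Euclidean distance to the selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]          # seed choice is an assumption
    while len(selected) < k:
        candidates = [v for v in vectors if v not in selected]
        selected.append(max(candidates,
                            key=lambda v: min(dist(v, s) for s in selected)))
    return selected
```

This naive version rescans all candidates each round (O(k * n) distance checks per iteration); it favours spread-out samples, which is why the selected vectors above mix clear matches and clear non-matches.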

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
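
The purity and entropy figures printed for each oracle-classified sample follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. A sketch that reproduces the values reported here for 26 matches and 61 non-matches:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity, binary entropy, and estimated match proportion of a sample."""
    total = num_matches + num_non_matches
    p = num_matches / total              # estimated match proportion
    purity = max(p, 1.0 - p)             # majority-class fraction
    entropy = -sum(q * math.log(q, 2) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p
```

For 26 matches and 61 non-matches this gives purity ≈ 0.7011 and entropy ≈ 0.8799, matching the queue entries shown in Loop 2.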

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 819 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 135 matches and 684 non-matches
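
The split step trains a binary classifier on the oracle-labelled sample and partitions the remaining weight vectors of the cluster by predicted class, yielding the two queue entries seen in the next loop. A sketch using scikit-learn's `SVC` (the library and the default RBF kernel are assumptions; the original script may use a different SVM implementation or parameters):

```python
# Requires scikit-learn (an assumption about the environment).
from sklearn.svm import SVC

def split_cluster(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on the oracle-labelled sample and split the remaining
    cluster members by predicted class (kernel/parameters are assumptions)."""
    clf = SVC()
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    pred_matches = [v for v, m in zip(remaining_vecs, preds) if m]
    pred_non_matches = [v for v, m in zip(remaining_vecs, preds) if not m]
    return pred_matches, pred_non_matches
```

Both sub-clusters are then pushed back onto the queue with the parent sample's purity, entropy, and estimated match proportion, as the Loop 2 queue listing shows.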

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (684, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 684 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 684 vectors
  The selected farthest weight vectors are:
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 13 matches and 59 non-matches
    Purity of oracle classification:  0.819
    Entropy of oracle classification: 0.681
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(15)667_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 667), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)667_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 921
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 921 weight vectors
  Containing 203 true matches and 718 true non-matches
    (22.04% true matches)
  Identified 870 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   836  (96.09%)
          2 :    31  (3.56%)
          3 :     2  (0.23%)
         17 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 870 unique weight vectors)
Pureness (as the proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 172
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 697

Removed 1 non-pure weight vector

Final number of weight vectors to use: 920
  Number of unique weight vectors: 870

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (870, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 870 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using the "far" method

Farthest first selection of 86 weight vectors from 870 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 25 matches and 61 non-matches
    Purity of oracle classification:  0.709
    Entropy of oracle classification: 0.870
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 784 weight vectors
  Based on 25 matches and 61 non-matches
  Classified 122 matches and 662 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (122, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)
    (662, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)

Current size of match and non-match training data sets: 25 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.71 and entropy 0.87
- Size 122 weight vectors
- Estimated match proportion 0.291

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 122 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analyzing file: diverg(10)728_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 728), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)728_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 586
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 586 weight vectors
  Containing 204 true matches and 382 true non-matches
    (34.81% true matches)
  Identified 553 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   539  (97.47%)
          2 :    11  (1.99%)
          3 :     2  (0.36%)
         19 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 553 unique weight vectors)
Pureness (as the proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 171
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 381

Removed 1 non-pure weight vector

Final number of weight vectors to use: 585
  Number of unique weight vectors: 553

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (553, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 553 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using the "far" method

Farthest first selection of 82 weight vectors from 553 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.714, 0.353, 0.583, 0.571] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 26 matches and 56 non-matches
    Purity of oracle classification:  0.683
    Entropy of oracle classification: 0.901
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
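
The purity and entropy figures reported above follow the standard definitions for a two-class sample: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (the function name is mine, not from the original script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = binary Shannon
    entropy of the match proportion in the classified sample."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# 26 matches and 56 non-matches, as in the oracle output above
purity, entropy = purity_entropy(26, 56)
print(round(purity, 3), round(entropy, 3))  # 0.683 0.901
```

This reproduces the 0.683 / 0.901 values printed for this 82-vector sample.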

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 471 weight vectors
  Based on 26 matches and 56 non-matches
  Classified 128 matches and 343 non-matches
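
The split step trains a classifier on the oracle-labelled vectors and partitions the remaining cluster by predicted class. The script uses an SVM; as a self-contained stand-in for the same partitioning idea, a nearest-centroid split can be sketched as (function names are mine):

```python
def centroid_split(train_matches, train_non_matches, cluster_vecs):
    """Split a cluster by distance to the centroids of the labelled match
    and non-match training vectors (a stand-in for the SVM split step)."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cm = centroid(train_matches)
    cn = centroid(train_non_matches)
    matches, non_matches = [], []
    for v in cluster_vecs:
        (matches if sqdist(v, cm) < sqdist(v, cn) else non_matches).append(v)
    return matches, non_matches
```

The two returned sub-clusters then go back onto the queue, as the Loop 2 output shows.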

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (128, 0.6829268292682927, 0.9011701959974223, 0.3170731707317073)
    (343, 0.6829268292682927, 0.9011701959974223, 0.3170731707317073)

Current size of match and non-match training data sets: 26 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 128 weight vectors
- Estimated match proportion 0.317

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 128 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
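
Farthest-first selection greedily picks the vector whose minimum distance to the already-selected set is largest. A minimal sketch (the seed choice and squared Euclidean distance are assumptions; the original implementation may differ):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector, then
    repeatedly add the vector farthest from the selected set."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # distance of each candidate to its nearest already-selected vector
        best = max(remaining,
                   key=lambda v: min(sqdist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

This tends to cover the corners of the weight-vector space, which is why the samples above mix very similar and very dissimilar record pairs.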

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 50 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.139
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)695_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 695), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)695_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 693
    Number of entity ID pairs that occurred more than once: 0
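
The load step parses one weight vector per CSV row in the `rec_id1,rec_id2,true_match_status,w1,w2,...` format described in the header, and counts how many entity ID pairs occur more than once. A hedged sketch (the original loader also selects a subset of weight columns, which is omitted here):

```python
import csv
from collections import Counter

def load_weight_vectors(path):
    """Parse rows of the form: rec_id1,rec_id2,true_match_status,w1,w2,..."""
    pairs, labels, weights = [], [], []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            pairs.append((row[0], row[1]))
            labels.append(float(row[2]) == 1.0)  # 1.0 marks a true match
            weights.append([float(w) for w in row[3:]])
    # how many entity ID pairs appear in more than one row
    num_repeated = sum(1 for c in Counter(pairs).values() if c > 1)
    return pairs, labels, weights, num_repeated
```

For the file above this would report 693 vectors and 0 repeated ID pairs.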

Analyse set of 693 weight vectors
  Containing 194 true matches and 499 true non-matches
    (27.99% true matches)
  Identified 669 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   652  (97.46%)
          2 :    14  (2.09%)
          3 :     2  (0.30%)
          7 :     1  (0.15%)
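
The frequency distribution above counts how often each distinct weight vector occurs. A `collections.Counter` over tuple-ised vectors reproduces it (the function name is mine):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map occurrence count -> number of distinct weight vectors
    occurring exactly that many times."""
    vec_counts = Counter(tuple(v) for v in vectors)
    return Counter(vec_counts.values())

vecs = [[0.1], [0.1], [0.2], [0.3], [0.3], [0.3]]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```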

Identified 0 non-pure unique weight vectors (from 669 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 172
     0.000 : 497

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 693
  Number of unique weight vectors: 669

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (669, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 669 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 669 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 33 matches and 51 non-matches
    Purity of oracle classification:  0.607
    Entropy of oracle classification: 0.967
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0
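
The oracle abstraction answers with the true match status, flipping each answer with probability one minus the configured accuracy (100% here, so nothing is flipped). A sketch of such a noisy oracle (the exact sampling mechanism in the original script is an assumption):

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    """Return oracle answers: each true label is kept with probability
    `accuracy`, otherwise flipped (simulating an imperfect human oracle)."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]

# With accuracy 1.0 the oracle is perfect:
print(noisy_oracle([True, False, True], 1.0))  # [True, False, True]
```

Lowering the accuracy introduces false matches and false non-matches into the counts reported above.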

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 585 weight vectors
  Based on 33 matches and 51 non-matches
  Classified 277 matches and 308 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (277, 0.6071428571428571, 0.9666186325481028, 0.39285714285714285)
    (308, 0.6071428571428571, 0.9666186325481028, 0.39285714285714285)

Current size of match and non-match training data sets: 33 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.97
- Size 308 weight vectors
- Estimated match proportion 0.393

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 308 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.462, 0.667, 0.600, 0.389, 0.615] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.680, 0.000, 0.609, 0.737, 0.600, 0.529, 0.696] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(10)545_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976923
recall                 0.424749
f-measure              0.592075
da                          130
dm                            0
ndm                           0
tp                          127
fp                            3
tn                  4.76529e+07
fn                          172
Name: (10, 1 - acm diverg, 545), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)545_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 541
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 541 weight vectors
  Containing 130 true matches and 411 true non-matches
    (24.03% true matches)
  Identified 510 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   482  (94.51%)
          2 :    25  (4.90%)
          3 :     3  (0.59%)

Identified 0 non-pure unique weight vectors (from 510 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 119
     0.000 : 391

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 541
  Number of unique weight vectors: 510

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (510, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 510 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 510 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 26 matches and 55 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.905
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 429 weight vectors
  Based on 26 matches and 55 non-matches
  Classified 116 matches and 313 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (116, 0.6790123456790124, 0.9054522631867894, 0.32098765432098764)
    (313, 0.6790123456790124, 0.9054522631867894, 0.32098765432098764)

Current size of match and non-match training data sets: 26 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 313 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 66

Farthest first selection of 66 weight vectors from 313 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.556, 0.348, 0.467, 0.636, 0.412] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.538, 0.600, 0.471, 0.632, 0.688] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.800, 0.667, 0.381, 0.550, 0.429] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.800, 0.000, 0.444, 0.545, 0.333, 0.111, 0.533] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.462, 0.667, 0.636, 0.368, 0.500] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 0 matches and 66 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 66 weight vectors (classified by oracle) from cluster
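The oracle step above simulates a (possibly imperfect) human classifier with a given accuracy. A minimal sketch of such a simulated oracle, assuming each true label is flipped independently with probability (1 - accuracy); the function name and signature are illustrative, not the program's actual API:

```python
import random

def simulated_oracle(vectors, true_labels, accuracy=1.0, seed=42):
    """Classify weight vectors as matches / non-matches, flipping each
    true label with probability (1 - accuracy) to model oracle errors."""
    rng = random.Random(seed)
    matches, non_matches = [], []
    for vec, is_match in zip(vectors, true_labels):
        # random() is in [0, 1), so accuracy=1.0 means a perfect oracle
        answer = is_match if rng.random() < accuracy else not is_match
        (matches if answer else non_matches).append(vec)
    return matches, non_matches
```

With accuracy 1.00, as in this run, the oracle reproduces the true match status exactly, which is why the log reports zero false matches and false non-matches.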

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

130.0
Analysing file: diverg(20)149_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 149), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)149_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec
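The analysis above counts how often each unique weight vector occurs and how pure it is (the proportion of true matches among identical vectors). A minimal sketch of that computation, with illustrative helper names:

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(vectors, true_labels):
    """Return the occurrence frequency distribution and the pureness
    (match proportion) of each unique weight vector."""
    occurrences = Counter(tuple(v) for v in vectors)
    match_counts = defaultdict(int)
    for vec, is_match in zip(vectors, true_labels):
        if is_match:
            match_counts[tuple(vec)] += 1
    # How many unique vectors occur once, twice, three times, ...
    freq_dist = Counter(occurrences.values())
    pureness = {vec: match_counts[vec] / count
                for vec, count in occurrences.items()}
    return freq_dist, pureness
```

Vectors with pureness strictly between 0 and 1 (like the single 0.947 entry above) are ambiguous, so the minority-class copies are removed before training-example selection.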

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
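The "farthest first selection" above is a greedy traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest, so the sample spreads over the whole cluster. A minimal sketch, assuming Euclidean distance and the first vector as the (arbitrary) starting point:

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors that are mutually far apart."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        j = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[j])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[j]))
    return selected
```

Each iteration costs one pass over the remaining vectors, so selecting k of n vectors is O(k·n) distance computations.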

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster
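The purity and entropy reported for the oracle classification follow from the match proportion p of the classified sample: purity is max(p, 1 - p) and entropy is the binary Shannon entropy of p. A minimal sketch that reproduces the numbers in the log (24 matches, 63 non-matches):

```python
import math

def cluster_purity_entropy(num_matches, num_non_matches):
    """Binary purity and Shannon entropy of an oracle-classified sample."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = sum(-q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = cluster_purity_entropy(24, 63)  # counts from the log
```

A pure cluster (all matches or all non-matches) has purity 1.0 and entropy 0.0, which is the stopping condition seen elsewhere in this log.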

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches
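The split step trains an SVM on the oracle-classified sample and uses it to divide the remaining unclassified vectors into candidate match and non-match clusters. A minimal sketch, assuming scikit-learn's `SVC` with a linear kernel (the program's actual split classifier and its settings may differ):

```python
from sklearn.svm import SVC

def svm_split(match_train, non_match_train, unlabelled):
    """Split unlabelled weight vectors into predicted matches / non-matches."""
    X = match_train + non_match_train
    y = [1] * len(match_train) + [0] * len(non_match_train)
    clf = SVC(kernel="linear")
    clf.fit(X, y)
    predictions = clf.predict(unlabelled)
    matches = [v for v, p in zip(unlabelled, predictions) if p == 1]
    non_matches = [v for v, p in zip(unlabelled, predictions) if p == 0]
    return matches, non_matches
```

The two predicted clusters are then pushed back onto the queue, as the Loop 2 output below shows.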

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 123 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)532_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 532), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)532_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analysing file: diverg(10)579_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                 0.976
recall                 0.408027
f-measure              0.575472
da                          125
dm                            0
ndm                           0
tp                          122
fp                            3
tn                  4.76529e+07
fn                          177
Name: (10, 1 - acm diverg, 579), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)579_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 671
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 671 weight vectors
  Containing 142 true matches and 529 true non-matches
    (21.16% true matches)
  Identified 655 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   644  (98.32%)
          2 :     8  (1.22%)
          3 :     2  (0.31%)
          5 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 655 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 128
     0.000 : 527

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 671
  Number of unique weight vectors: 655

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (655, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 655 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 655 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 571 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 80 matches and 491 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (80, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (491, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 491 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 491 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
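The farthest-first selections logged above can be sketched as a greedy max-min procedure: start from one vector and repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and a random starting vector (the program's actual seeding and distance metric may differ):

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Greedily select k vectors, each maximising its minimum
    Euclidean distance to the vectors selected so far."""
    rng = random.Random(seed)
    remaining = list(vectors)
    selected = [remaining.pop(rng.randrange(len(remaining)))]
    while remaining and len(selected) < k:
        # pick the remaining vector farthest from the selected set
        far = max(remaining,
                  key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(far)
        selected.append(far)
    return selected
```

Farthest-first selection tends to pick vectors near the corners of the weight-vector space, which is why both clear matches and clear non-matches show up in the samples above.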

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and misclassify 0
  Classified 10 matches and 65 non-matches
    Purity of oracle classification:  0.867
    Entropy of oracle classification: 0.567
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
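The purity and entropy figures reported for a classified sample are consistent with the usual two-class definitions: purity as the majority-class fraction and entropy as the binary Shannon entropy of the match proportion. A sketch under that assumption:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary entropy of a labelled sample."""
    total = num_matches + num_non_matches
    p = num_matches / total                        # match proportion
    purity = max(num_matches, num_non_matches) / total
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# 10 matches and 65 non-matches, as in the oracle output above
purity, entropy = purity_entropy(10, 65)
print(round(purity, 3), round(entropy, 3))         # 0.867 0.567
```

The same formulas reproduce the per-cluster queue statistics, e.g. 32 matches / 50 non-matches gives purity 0.6098 and entropy 0.9650 as listed in Loop 2.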

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
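Each loop above follows the same pattern: pop a cluster from the queue, sample it, have the oracle label the sample, and, if the remainder is still impure or too large, split it with a classifier and re-queue the children, until the manual classification budget is exhausted. A high-level skeleton with hypothetical callables `sample`, `oracle`, and `split` (the real program additionally tracks purity estimates, minimum cluster sizes, and queue ordering):

```python
def recursive_selection(clusters, budget, min_purity, max_cluster_size,
                        sample, oracle, split):
    """Skeleton of the recursive training-example selection loop.
    sample(cluster) picks vectors to label, oracle(chosen) returns
    (matches, non_matches), split(m, n, rest) returns child clusters."""
    train_m, train_n = [], []
    used = 0
    queue = list(clusters)
    while queue and used < budget:
        cluster = queue.pop(0)
        chosen = sample(cluster)
        matches, non_matches = oracle(chosen)
        used += len(chosen)
        train_m += matches
        train_n += non_matches
        rest = [v for v in cluster if v not in chosen]
        purity = max(len(matches), len(non_matches)) / max(len(chosen), 1)
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            queue.extend(split(train_m, train_n, rest))
    return train_m, train_n
```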

125.0
Analyzing the file: diverg(15)907_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 907), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)907_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 595
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 595 weight vectors
  Containing 207 true matches and 388 true non-matches
    (34.79% true matches)
  Identified 561 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   543  (96.79%)
          2 :    15  (2.67%)
          3 :     2  (0.36%)
         16 :     1  (0.18%)
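The occurrence distribution above (595 vectors reducing to 561 unique ones, with one vector occurring 16 times) can be reproduced with a `Counter` over hashable vectors; a small sketch:

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map each occurrence count to the number of unique weight
    vectors that occur exactly that often."""
    per_vector = Counter(map(tuple, vectors))
    return Counter(per_vector.values())

# three copies of one vector, two of another, one singleton
dist = occurrence_distribution([[0.1, 0.2]] * 3 + [[0.5, 0.5]] * 2 + [[0.9, 0.1]])
# dist == {3: 1, 2: 1, 1: 1}
```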

Identified 1 non-pure unique weight vector (from 561 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 175
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 385

Removed 1 non-pure weight vector

Final number of weight vectors to use: 594
  Number of unique weight vectors: 561

Time to load and analyse the weight vector file: 0.01 sec
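A non-pure unique weight vector is one that occurs with both match statuses (here, 15 match copies and 1 non-match copy of the same vector, pureness 15/16 = 0.938). Dropping the minority-class copies can be sketched as follows (the program also appears to drop all copies when the pureness falls below some threshold, which this sketch omits):

```python
from collections import Counter, defaultdict

def remove_minority_copies(labelled_vectors):
    """Keep only the majority-label copies of each unique weight
    vector; pure vectors (a single label) pass through unchanged."""
    label_counts = defaultdict(Counter)
    for vec, label in labelled_vectors:
        label_counts[tuple(vec)][label] += 1
    kept = []
    for vec, label in labelled_vectors:
        counts = label_counts[tuple(vec)]
        # ties keep both labels; acceptable for a sketch
        if len(counts) == 1 or counts[label] == max(counts.values()):
            kept.append((vec, label))
    return kept
```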

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (561, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 561 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 561 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and misclassify 0
  Classified 32 matches and 50 non-matches
    Purity of oracle classification:  0.610
    Entropy of oracle classification: 0.965
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 479 weight vectors
  Based on 32 matches and 50 non-matches
  Classified 153 matches and 326 non-matches
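The SVM split of the remaining cluster into predicted matches and non-matches can be sketched with scikit-learn (assuming a default `SVC`; the kernel and parameters actually used by the program are not shown in the log):

```python
from sklearn.svm import SVC

def svm_split(train_matches, train_non_matches, cluster):
    """Train an SVM on the oracle-labelled vectors, then split the
    remaining cluster by predicted match status."""
    X = train_matches + train_non_matches
    y = [1] * len(train_matches) + [0] * len(train_non_matches)
    clf = SVC()      # default RBF kernel; an assumption, not the program's setting
    clf.fit(X, y)
    predictions = clf.predict(cluster)
    matches = [v for v, p in zip(cluster, predictions) if p == 1]
    non_matches = [v for v, p in zip(cluster, predictions) if p == 0]
    return matches, non_matches
```

The two predicted sides then become the child clusters pushed onto the queue, each initially carrying the parent sample's purity, entropy, and estimated match proportion, as the Loop 2 queue listing shows.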

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)
    (326, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)

Current size of match and non-match training data sets: 32 / 50

Selected cluster (queue ordering: random):
- Purity 0.61 and entropy 0.96
- Size 326 weight vectors
- Estimated match proportion 0.390

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 326 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.846, 0.737, 0.706, 0.583, 0.800] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.813, 0.619, 0.333, 0.500, 0.571] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.778, 0.429, 0.571, 0.750, 0.600] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.750, 0.905, 0.667, 0.500, 0.571] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.615, 0.826, 0.286, 0.857, 0.643] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [0.667, 0.000, 0.500, 0.600, 0.353, 0.611, 0.526] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 0.000, 0.778, 0.677, 0.773, 0.161, 0.238] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and misclassify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analyzing the file: diverg(15)192_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 192), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)192_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 797
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 797 weight vectors
  Containing 224 true matches and 573 true non-matches
    (28.11% true matches)
  Identified 758 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   739  (97.49%)
          2 :    16  (2.11%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 758 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 570

Removed 1 non-pure weight vector

Final number of weight vectors to use: 796
  Number of unique weight vectors: 758

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (758, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 758 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 758 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and misclassify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 673 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 149 matches and 524 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (524, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 524 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 524 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.433, 0.667, 0.500, 0.636, 0.421] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and misclassify 0
  Classified 4 matches and 71 non-matches
    Purity of oracle classification:  0.947
    Entropy of oracle classification: 0.300
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0
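
The purity and entropy values printed in the oracle block follow the standard two-class definitions: purity is the majority-class fraction of the classified sample, and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch (function names are illustrative, not taken from the program) that reproduces the logged values for 4 matches and 71 non-matches:

```python
import math

def purity(num_match, num_non_match):
    """Fraction of the sample belonging to the majority class."""
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    """Two-class Shannon entropy (in bits) of the match proportion."""
    total = num_match + num_non_match
    p = num_match / total
    if p in (0.0, 1.0):  # a pure sample has zero entropy
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Reproduces the oracle statistics above: 4 matches, 71 non-matches
print(round(purity(4, 71), 3))   # 0.947
print(round(entropy(4, 71), 3))  # 0.3
```

The same formulas also account for the queue entries later in the log, where each child cluster inherits the purity and entropy of the oracle-classified sample.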

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)565_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987952
recall                 0.274247
f-measure              0.429319
da                           83
dm                            0
ndm                           0
tp                           82
fp                            1
tn                  4.76529e+07
fn                          217
Name: (10, 1 - acm diverg, 565), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)565_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 622
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 622 weight vectors
  Containing 171 true matches and 451 true non-matches
    (27.49% true matches)
  Identified 583 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   553  (94.85%)
          2 :    27  (4.63%)
          3 :     2  (0.34%)
          9 :     1  (0.17%)
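
The occurrence distribution above can be produced by hashing each weight vector as a tuple and counting duplicates; a minimal sketch using `collections.Counter` (the function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then tally how
    many distinct vectors share each occurrence count."""
    # Tuples are hashable, so each weight vector can serve as a dict key
    vec_counts = Counter(tuple(wv) for wv in weight_vectors)
    # Map: occurrence count -> number of distinct vectors occurring that often
    return Counter(vec_counts.values())

# Example: one vector occurring three times, one unique vector
dist = occurrence_distribution([(0.5, 1.0), (0.5, 1.0), (0.5, 1.0), (0.9, 0.0)])
```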

Identified 1 non-pure unique weight vector (from 583 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 152
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 430

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 613
  Number of unique weight vectors: 582

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (582, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 582 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 582 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
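
The "far" method above is a farthest-first traversal: at each step it adds the weight vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster. A sketch under assumptions (Euclidean distance and the first vector as seed; the program's actual seed choice and metric may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    nearest already-selected vector is as far away as possible.
    Seed (first vector) and Euclidean distance are assumptions."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # For each candidate, its distance to the closest selected vector
        def min_dist(v):
            return min(math.dist(v, s) for s in selected)
        farthest = max(remaining, key=min_dist)
        remaining.remove(farthest)
        selected.append(farthest)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.5, 0.5)]
sample = farthest_first(pts, 3)
```

Each step is O(|selected| * |remaining|), so the whole selection is quadratic in the cluster size, which is acceptable at the cluster sizes shown in this log.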

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 29 matches and 53 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.937
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 500 weight vectors
  Based on 29 matches and 53 non-matches
  Classified 173 matches and 327 non-matches
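
The split step trains a classifier on the oracle-labelled sample and partitions the remaining vectors of the cluster by its predictions, which is why the next loop shows two child clusters (here of sizes 173 and 327). A sketch using scikit-learn's `SVC`; the kernel and all other settings are assumptions, since the log only names "SVM":

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled weight vectors, then split the
    remaining cluster into predicted matches (label 1) and non-matches
    (label 0). The linear kernel is an assumption, not from the program."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    predictions = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, predictions) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, predictions) if p == 0]
    return matches, non_matches
```

Both child clusters are then pushed back on the queue with the purity, entropy, and estimated match proportion of the sample they were split from.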

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (173, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)
    (327, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)

Current size of match and non-match training data sets: 29 / 53

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 327 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 327 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.556, 0.348, 0.467, 0.636, 0.412] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.269, 0.478, 0.750, 0.385, 0.455] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.538, 0.600, 0.471, 0.632, 0.688] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.800, 0.667, 0.381, 0.550, 0.429] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.667, 0.286, 0.556, 0.259, 0.250] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 0 matches and 69 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

83.0
Analysing file: diverg(20)630_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 630), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)630_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 566 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)921_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (20, 1 - acm diverg, 921), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)921_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1086
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1086 weight vectors
  Containing 214 true matches and 872 true non-matches
    (19.71% true matches)
  Identified 1032 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   997  (96.61%)
          2 :    32  (3.10%)
          3 :     2  (0.19%)
         19 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1032 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 851

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1085
  Number of unique weight vectors: 1032

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1032, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1032 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1032 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
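
The farthest-first traversal used for this selection can be sketched as below. This is a minimal sketch assuming plain Euclidean distance and greedy seeding with the first vector; the program's actual distance metric and seeding rule are not shown in the log, and `farthest_first` is a hypothetical helper name.

```python
import math

def euclidean(a, b):
    # plain Euclidean distance between two weight vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(vectors, k):
    """Greedily pick k vectors, each maximising the distance to its
    nearest already-selected vector (farthest-first traversal)."""
    selected = [vectors[0]]  # seed with an arbitrary vector
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(euclidean(v, s) for s in selected),
        )
        selected.append(best)
    return selected
```

Each round costs O(n * |selected|) distance evaluations, which is why the sample sizes above stay small relative to the cluster sizes.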

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
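
The purity and entropy figures reported for an oracle-classified sample follow the usual majority-class fraction and binary Shannon entropy definitions. A sketch (the function names are ours, not the program's):

```python
import math

def purity(num_match, num_non_match):
    # fraction of the sample belonging to the majority class
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    # binary Shannon entropy (in bits) of the match/non-match split
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h
```

For the 23 matches and 65 non-matches above, these give 0.739 and 0.829, matching the values reported in the log.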

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 944 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 98 matches and 846 non-matches
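
The SVM split step, trained on the oracle-labelled sample and applied to the rest of the cluster, could look roughly like the following. Scikit-learn's `LinearSVC` and the helper name `split_cluster` are assumptions; the log does not reveal the actual SVM implementation or its parameters.

```python
import numpy as np
from sklearn.svm import LinearSVC  # assumed stand-in for the program's SVM

def split_cluster(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled vectors (label 1 = match,
    0 = non-match) and split the remaining cluster into predicted
    matches and non-matches."""
    clf = LinearSVC(random_state=0)
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches
```

The two predicted sub-clusters are then pushed back onto the queue, as seen in the next loop iteration.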

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (98, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 846 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 846 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
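
An oracle with a configurable accuracy, as simulated above, can be sketched as follows; `simulated_oracle` is a hypothetical helper name. At 100% accuracy it reproduces the true match status exactly, which is why no false matches or false non-matches appear in this run.

```python
import random

def simulated_oracle(true_labels, accuracy, rng=None):
    """Return labels where each true label is kept with probability
    `accuracy` and flipped otherwise."""
    rng = rng or random.Random()
    return [t if rng.random() < accuracy else not t for t in true_labels]
```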

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
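
Putting the pieces together, the loop traced in this log (a queue of clusters; pick one, label a sample via the oracle, split the remainder with a classifier, stop at the budget) can be sketched as below. All names and the simplified sampling and splitting hooks are assumptions; the real program's control flow is only visible through its output.

```python
def recursive_selection(weight_vectors, budget, sample_size, split, label):
    """Simplified sketch of the recursive training-example selection loop.
    `label(vec)` stands in for the oracle, `split(rest, m, n)` for the
    classifier-based cluster split."""
    queue = [weight_vectors]   # start with one cluster holding everything
    train_m, train_n = [], []  # match / non-match training sets
    used = 0                   # manual classifications performed so far
    while queue and used + sample_size <= budget:
        cluster = queue.pop(0)
        sample = cluster[:sample_size]   # stand-in for farthest-first
        remaining = cluster[sample_size:]
        for vec in sample:               # oracle labels the sample
            (train_m if label(vec) else train_n).append(vec)
        used += len(sample)
        if remaining:                    # split the rest and re-queue
            queue.extend(split(remaining, train_m, train_n))
    return train_m, train_n, used
```

With a budget of 176 and per-cluster samples of 88 and 68 this would stop after two loops, as the run above does.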

53.0
Analysing the file: diverg(20)545_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 545), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)545_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916
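
The duplicate-frequency and pureness analysis reported above amounts to counting identical weight vectors and, for each unique vector, the fraction of its copies that are true matches; minority-class copies of non-pure vectors are then dropped. A sketch with hypothetical names:

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(weight_vectors, labels):
    """Count duplicate weight vectors and the 'pureness' (fraction of
    true matches) of each unique vector."""
    freq = Counter(tuple(v) for v in weight_vectors)
    match_count = defaultdict(int)
    for vec, is_match in zip(weight_vectors, labels):
        if is_match:
            match_count[tuple(vec)] += 1
    pureness = {vec: match_count[vec] / count for vec, count in freq.items()}
    return freq, pureness
```

A vector with pureness 0.947, as in the run above, occurs with both labels; removing its minority-class copies leaves only pure (0.0 or 1.0) unique vectors for training.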

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(10)674_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 674), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)674_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 960
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 960 weight vectors
  Containing 217 true matches and 743 true non-matches
    (22.60% true matches)
  Identified 905 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   869  (96.02%)
          2 :    33  (3.65%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 905 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 722

Removed 1 non-pure weight vector

Final number of weight vectors to use: 959
  Number of unique weight vectors: 905

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (905, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 905 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 905 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
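The purity and entropy figures reported in the oracle blocks can be reproduced with a short sketch. This is a reconstruction from the logged numbers (majority-class purity and base-2 Shannon entropy over the match/non-match proportions), not the script's original code:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Majority-class purity and two-class Shannon entropy (base 2),
    as reported for each oracle-classified sample in the log.
    (Reconstructed from the logged values, not the original code.)"""
    total = num_matches + num_non_matches
    proportions = (num_matches / total, num_non_matches / total)
    purity = max(proportions)
    entropy = -sum(p * math.log2(p) for p in proportions if p > 0.0)
    return purity, entropy

# The first oracle block above: 26 matches, 61 non-matches
purity, entropy = purity_and_entropy(26, 61)
print(f"{purity:.3f} {entropy:.3f}")  # 0.701 0.880
```

The same two numbers recur for every oracle block in the log, so this check can be applied to any of them.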

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 818 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 135 matches and 683 non-matches
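The SVM step logged above (train on the oracle-labelled vectors, then classify the remaining weight vectors of the cluster) can be sketched as follows. This assumes scikit-learn is available; the script's actual kernel and parameters are not shown in the log, and the training vectors below are made-up stand-ins:

```python
from sklearn import svm

# Hypothetical stand-ins for oracle-labelled similarity weight vectors:
# high-similarity vectors as matches (1), low-similarity as non-matches (0)
train_vectors = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.9],
                 [0.1, 0.2, 0.1], [0.2, 0.1, 0.3]]
train_labels = [1, 1, 0, 0]

clf = svm.SVC()  # default RBF kernel; the script's settings are unknown
clf.fit(train_vectors, train_labels)

# Remaining (unlabelled) weight vectors of the cluster
remaining = [[0.85, 0.85, 0.8], [0.15, 0.15, 0.2]]
pred = clf.predict(remaining)
print([int(v) for v in pred])  # one predicted match, one non-match
```

In the log, the predicted matches and non-matches become the two new clusters pushed onto the queue for the next loop.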

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (683, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 683 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 683 vectors
  The selected farthest weight vectors are:
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
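The "Farthest first selection" steps above can be sketched as a greedy farthest-first traversal: start from a seed vector, then repeatedly add the vector whose distance to the nearest already-selected vector is largest. Euclidean distance and a fixed seed index are assumptions here; the script may seed and measure differently:

```python
import math

def farthest_first(vectors, k, seed_index=0):
    """Greedy farthest-first traversal over a list of weight vectors.
    A sketch of the 'far' selection method named in the log; the
    seeding strategy and distance metric are assumptions."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [seed_index]
    # min_dist[i] = distance from vector i to its nearest selected vector
    min_dist = [dist(vectors[seed_index], v) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        next_i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(next_i)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(vectors[next_i], v))
    return selected

# Tiny 2-D example: the traversal picks well-spread vectors
vecs = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [0.9, 1.0], [0.5, 0.5]]
print(farthest_first(vecs, 3))  # → [0, 1, 4]
```

This explains why the selected samples above mix very dissimilar vectors: the traversal maximises spread rather than following the class distribution.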

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 13 matches and 59 non-matches
    Purity of oracle classification:  0.819
    Entropy of oracle classification: 0.681
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(15)266_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 266), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)266_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 896
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 896 weight vectors
  Containing 201 true matches and 695 true non-matches
    (22.43% true matches)
  Identified 847 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   813  (95.99%)
          2 :    31  (3.66%)
          3 :     2  (0.24%)
         15 :     1  (0.12%)
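The occurrence distribution above (how many unique weight vectors occur once, twice, and so on) takes only a few lines to compute; this is a reconstruction, not the script's code, and hashes each vector as a tuple:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of unique weight vectors with
    that count, as printed in the log's frequency table."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

# Toy example: one vector appears twice, one three times, one once
vecs = [[0.5, 1.0], [0.5, 1.0], [0.2, 0.3],
        [0.9, 0.9], [0.9, 0.9], [0.9, 0.9]]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```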

Identified 1 non-pure unique weight vector (from 847 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 172
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 674

Removed 1 non-pure weight vector

Final number of weight vectors to use: 895
  Number of unique weight vectors: 847

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (847, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 847 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 761 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 156 matches and 605 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (605, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 605 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 605 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 0 matches and 75 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analyzing file: diverg(15)573_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 573), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)573_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 829
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 829 weight vectors
  Containing 227 true matches and 602 true non-matches
    (27.38% true matches)
  Identified 772 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   735  (95.21%)
          2 :    34  (4.40%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 772 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 581

Removed 1 non-pure weight vector

Final number of weight vectors to use: 828
  Number of unique weight vectors: 772

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (772, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 772 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 772 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 687 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 150 matches and 537 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (537, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 150 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
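The "farthest first" selection listed above greedily picks, at each step, the weight vector whose minimum Euclidean distance to the already-selected vectors is largest, spreading the labelled sample over the whole cluster. A minimal sketch of the idea (the function name and details are illustrative, not the program's actual code):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly select the vector
    whose minimum distance to the already-selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        # For each candidate, its distance to the nearest selected vector
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Starting from `[0.0]` in `[[0.0], [1.0], [10.0]]`, selecting two vectors yields `[[0.0], [10.0]]`, since `[10.0]` is farthest from the initial pick.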

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 51 matches and 3 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.310
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0
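The purity and entropy figures reported for each oracle classification follow the standard cluster-quality definitions: purity is the fraction of the majority class, and entropy is the binary Shannon entropy of the match/non-match split. A sketch reconstructed from the reported numbers (51 matches, 3 non-matches above give 0.944 and 0.310), not taken from the program source:

```python
import math

def purity(num_match, num_non_match):
    """Fraction of the majority class in the classified set."""
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    """Binary Shannon entropy of the match / non-match split."""
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# 51 matches and 3 non-matches, as in the oracle step above
print(f"{purity(51, 3):.3f}")   # prints 0.944
print(f"{entropy(51, 3):.3f}")  # prints 0.310
```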

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)949_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (20, 1 - acm diverg, 949), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)949_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 201 true matches and 752 true non-matches
    (21.09% true matches)
  Identified 908 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   874  (96.26%)
          2 :    31  (3.41%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 908 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 952
  Number of unique weight vectors: 908
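The pureness analysis above groups identical weight vectors and measures, for each unique vector, the fraction of its occurrences that are true matches; for non-pure vectors (pureness strictly between 0 and 1) the minority-class occurrences are removed, which is how 953 vectors become 952 here. A sketch of the idea with a hypothetical helper name:

```python
from collections import defaultdict

def pureness_filter(weight_vectors, labels):
    """Group identical weight vectors, compute the fraction of true
    matches per unique vector, and drop minority-class occurrences
    of non-pure vectors (0 < pureness < 1). Ties keep the matches."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, labels):
        groups[tuple(vec)].append(is_match)

    kept = []
    for vec, group in groups.items():
        pureness = sum(group) / len(group)
        majority = pureness >= 0.5
        for is_match in group:
            # Pure vectors (pureness exactly 0 or 1) are kept whole;
            # otherwise only the majority class survives
            if pureness in (0.0, 1.0) or is_match == majority:
                kept.append((list(vec), is_match))
    return kept
```

In the log above, the one unique vector with pureness 0.909 occurred 11 times (10 matches, 1 non-match), so its single non-match occurrence is removed.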

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (908, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 908 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 908 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 821 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 115 matches and 706 non-matches
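After the oracle labels a sample, the remaining unlabelled weight vectors in the cluster are split by a classifier trained on those labels: here an SVM predicts match/non-match for the 821 leftover vectors and the cluster is partitioned accordingly into two queue entries. A minimal sketch using scikit-learn's `SVC` (the original program's SVM implementation and parameters are not shown in this log, so the defaults below are an assumption):

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, cluster_vectors):
    """Train an SVM on oracle-labelled weight vectors and split the
    remaining cluster into predicted matches and non-matches."""
    clf = SVC()  # default RBF kernel; actual settings unknown
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, preds) if p]
    non_matches = [v for v, p in zip(cluster_vectors, preds) if not p]
    return matches, non_matches
```

The two resulting lists become the two clusters seen in the queue of Loop 2 (115 predicted matches and 706 predicted non-matches).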

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (115, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 115 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 46

Farthest first selection of 46 weight vectors from 115 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 46 weight vectors
  The oracle will correctly classify 46 weight vectors and wrongly classify 0
  Classified 46 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 46 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(10)472_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 472), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)472_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 774
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 774 weight vectors
  Containing 197 true matches and 577 true non-matches
    (25.45% true matches)
  Identified 732 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   697  (95.22%)
          2 :    32  (4.37%)
          3 :     2  (0.27%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 732 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.000 : 557

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 774
  Number of unique weight vectors: 732

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (732, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 732 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 732 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 647 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 143 matches and 504 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (504, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 504 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 504 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 3 matches and 68 non-matches
    Purity of oracle classification:  0.958
    Entropy of oracle classification: 0.253
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(20)746_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (20, 1 - acm diverg, 746), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)746_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 201 true matches and 752 true non-matches
    (21.09% true matches)
  Identified 908 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   874  (96.26%)
          2 :    31  (3.41%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 908 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 952
  Number of unique weight vectors: 908

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (908, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 908 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 908 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
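
The "far" (farthest-first) selection above can be sketched as a greedy traversal: seed with one vector, then repeatedly pick the vector whose distance to its nearest already-selected vector is largest. This is a minimal illustration, assuming Euclidean distance and seeding from the first vector; the original program's seeding and distance function may differ.

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal (sketch of the 'far' method)."""
    selected = [vectors[0]]
    # Distance from each candidate to its nearest selected vector so far
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        # Pick the candidate farthest from everything selected so far
        idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected
```

Because each new pick maximises the minimum distance to the current selection, the sample spreads across the weight-vector space rather than clustering around the seed.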

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
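
The purity, entropy, and match-proportion figures reported for an oracle sample follow directly from the two class counts; a minimal sketch, assuming binary Shannon entropy in bits (it reproduces the values in this log, e.g. 26 matches / 61 non-matches gives purity 0.701 and entropy 0.880):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity, entropy (bits), and match proportion of a labelled sample."""
    total = num_matches + num_non_matches
    p = num_matches / total              # estimated match proportion
    purity = max(p, 1.0 - p)             # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)  # binary Shannon entropy
    return purity, entropy, p
```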

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 821 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 119 matches and 702 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (119, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (702, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 702 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 702 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 13 matches and 59 non-matches
    Purity of oracle classification:  0.819
    Entropy of oracle classification: 0.681
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
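
The overall procedure visible in this log — pop a cluster from the queue, sample it, label the sample with the oracle, and split the remainder if the cluster is still impure or too large, stopping at the manual-classification budget — can be sketched as below. All function names and thresholds are illustrative, not taken from the original program:

```python
def recursive_select(queue, budget, sample, oracle, split,
                     min_purity=0.95, max_cluster_size=100):
    """Budgeted recursive training-example selection (illustrative sketch)."""
    used = 0
    train_match, train_non_match = [], []
    while queue and used < budget:
        cluster = queue.pop(0)            # queue ordering: FIFO here
        chosen = sample(cluster)          # e.g. farthest-first selection
        labelled = oracle(chosen)         # list of (vector, is_match) pairs
        used += len(labelled)
        n_m = sum(1 for _, is_m in labelled if is_m)
        train_match += [v for v, is_m in labelled if is_m]
        train_non_match += [v for v, is_m in labelled if not is_m]
        remaining = [v for v in cluster if v not in chosen]
        purity = max(n_m, len(labelled) - n_m) / len(labelled)
        # Split further only if the sample was impure or the cluster too big
        if remaining and (purity < min_purity or len(remaining) > max_cluster_size):
            queue.extend(split(train_match, train_non_match, remaining))
    return train_match, train_non_match
```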

58.0
Analysing file: diverg(10)747_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 747), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)747_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 375
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 375 weight vectors
  Containing 205 true matches and 170 true non-matches
    (54.67% true matches)
  Identified 344 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   330  (95.93%)
          2 :    11  (3.20%)
          3 :     2  (0.58%)
         17 :     1  (0.29%)

Identified 1 non-pure unique weight vector (from 344 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 169

Removed 1 non-pure weight vector

Final number of weight vectors to use: 374
  Number of unique weight vectors: 344

Time to load and analyse the weight vector file: 0.00 sec
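
The load-and-analyse step above (occurrence counts of duplicate weight vectors, their frequency distribution, and detection of "non-pure" unique vectors produced by both true matches and true non-matches) can be sketched with `Counter`; the function name is illustrative:

```python
from collections import Counter, defaultdict

def analyse_vectors(weight_vectors, match_flags):
    """Duplicate and pureness analysis of a set of weight vectors."""
    as_tuples = [tuple(v) for v in weight_vectors]
    occ = Counter(as_tuples)              # vector -> how often it occurs
    labels = defaultdict(set)
    for vec, is_match in zip(as_tuples, match_flags):
        labels[vec].add(is_match)
    freq_dist = Counter(occ.values())     # occurrence -> number of vectors
    # Non-pure: the same vector was generated by both classes
    non_pure = [v for v, flags in labels.items() if len(flags) > 1]
    return occ, freq_dist, non_pure
```

Minority-class copies of each non-pure vector would then be removed, as the "Removed ... non-pure weight vector" lines report.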

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (344, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 344 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 344 vectors
  The selected farthest weight vectors are:
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 45 matches and 30 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  30
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 269 weight vectors
  Based on 45 matches and 30 non-matches
  Classified 158 matches and 111 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (158, 0.6, 0.9709505944546686, 0.6)
    (111, 0.6, 0.9709505944546686, 0.6)

Current size of match and non-match training data sets: 45 / 30

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 111 weight vectors
- Estimated match proportion 0.600

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 111 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.917, 1.000, 0.216, 0.231, 0.206, 0.178, 0.148] (False)
    [0.750, 1.000, 0.220, 0.208, 0.250, 0.250, 0.148] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.739, 1.000, 0.130, 0.171, 0.171, 0.120, 0.122] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.800, 1.000, 0.237, 0.243, 0.185, 0.095, 0.125] (False)
    [0.817, 1.000, 0.250, 0.212, 0.256, 0.045, 0.250] (False)
    [0.600, 0.944, 0.250, 0.200, 0.186, 0.136, 0.118] (False)
    [0.881, 1.000, 0.211, 0.250, 0.129, 0.250, 0.211] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.750, 1.000, 0.146, 0.130, 0.176, 0.318, 0.167] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.800, 1.000, 0.211, 0.133, 0.074, 0.133, 0.185] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.731, 1.000, 0.235, 0.214, 0.133, 0.083, 0.222] (False)
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [0.663, 1.000, 0.273, 0.244, 0.226, 0.196, 0.238] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.635, 1.000, 0.179, 0.265, 0.167, 0.121, 0.241] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.652, 1.000, 0.174, 0.188, 0.167, 0.167, 0.063] (False)
    [0.733, 1.000, 0.139, 0.238, 0.186, 0.250, 0.238] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.592, 1.000, 0.179, 0.205, 0.156, 0.273, 0.180] (False)
    [0.518, 1.000, 0.179, 0.245, 0.111, 0.182, 0.103] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.890, 1.000, 0.281, 0.136, 0.183, 0.250, 0.163] (False)
    [0.902, 1.000, 0.182, 0.071, 0.182, 0.222, 0.190] (False)
    [0.929, 1.000, 0.182, 0.238, 0.188, 0.146, 0.270] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.663, 1.000, 0.132, 0.143, 0.241, 0.174, 0.167] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.800, 1.000, 0.111, 0.200, 0.100, 0.194, 0.094] (False)
    [0.747, 1.000, 0.231, 0.167, 0.107, 0.222, 0.125] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 0 matches and 51 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(20)756_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 756), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)756_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 148 matches and 784 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (784, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 148 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
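
Farthest-first selection, as used above, greedily picks the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster. A minimal sketch (starting deterministically from the first vector; how the script seeds the traversal is an assumption):

```python
import numpy as np

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a set of weight vectors."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [0]  # deterministic start; the script's seeding may differ
    min_dist = np.linalg.norm(vectors - vectors[0], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))  # farthest from everything chosen so far
        selected.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```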

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)417_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 417), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)417_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 395
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 395 weight vectors
  Containing 213 true matches and 182 true non-matches
    (53.92% true matches)
  Identified 358 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   340  (94.97%)
          2 :    15  (4.19%)
          3 :     2  (0.56%)
         19 :     1  (0.28%)
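
The occurrence histogram above (how many weight vectors appear once, twice, ...) takes two counting passes; a small sketch over made-up vectors:

```python
from collections import Counter

vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (1.0, 0.5)]
occurrences = Counter(vectors)             # vector -> how often it occurs
histogram = Counter(occurrences.values())  # occurrence count -> number of vectors
print(sorted(histogram.items()))  # [(1, 1), (3, 1)]
```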

Identified 1 non-pure unique weight vector (from 358 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 179

Removed 1 non-pure weight vector
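
A unique weight vector's pureness is the fraction of its occurrences generated by true matches; for any non-pure vector, the copies in the minority class are dropped (hence exactly one vector removed above). A sketch of that filter (the function name and the >= 0.5 tie rule are assumptions):

```python
from collections import defaultdict

def remove_minority(pairs):
    """pairs: iterable of (vector_tuple, is_match). Drop the
    minority-class copies of every non-pure unique vector."""
    by_vec = defaultdict(list)
    for vec, is_match in pairs:
        by_vec[vec].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        pureness = sum(labels) / len(labels)  # fraction of true matches
        majority_is_match = pureness >= 0.5   # tie handling is an assumption
        for lab in labels:
            if pureness in (0.0, 1.0) or lab == majority_is_match:
                kept.append((vec, lab))
    return kept
```

In the 0.947 case above, 18 of the vector's 19 copies are matches, so the single non-match copy is removed.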

Final number of weight vectors to use: 394
  Number of unique weight vectors: 358

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (358, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 358 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 358 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 31 matches and 45 non-matches
    Purity of oracle classification:  0.592
    Entropy of oracle classification: 0.975
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 282 weight vectors
  Based on 31 matches and 45 non-matches
  Classified 151 matches and 131 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.5921052631578947, 0.9753817903274212, 0.40789473684210525)
    (131, 0.5921052631578947, 0.9753817903274212, 0.40789473684210525)

Current size of match and non-match training data sets: 31 / 45

Selected cluster (queue ordering: random) with:
- Purity 0.59 and entropy 0.98
- Size 151 weight vectors
- Estimated match proportion 0.408

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 52 matches and 6 non-matches
    Purity of oracle classification:  0.897
    Entropy of oracle classification: 0.480
    Number of true matches:      52
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)825_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (10, 1 - acm diverg, 825), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)825_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 380
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 380 weight vectors
  Containing 197 true matches and 183 true non-matches
    (51.84% true matches)
  Identified 351 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   338  (96.30%)
          2 :    10  (2.85%)
          3 :     2  (0.57%)
         16 :     1  (0.28%)

Identified 1 non-pure unique weight vector (from 351 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 168
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 182

Removed 1 non-pure weight vector

Final number of weight vectors to use: 379
  Number of unique weight vectors: 351

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (351, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 351 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 351 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 41 matches and 34 non-matches
    Purity of oracle classification:  0.547
    Entropy of oracle classification: 0.994
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  34
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 276 weight vectors
  Based on 41 matches and 34 non-matches
  Classified 135 matches and 141 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.5466666666666666, 0.993707106604508, 0.5466666666666666)
    (141, 0.5466666666666666, 0.993707106604508, 0.5466666666666666)

Current size of match and non-match training data sets: 41 / 34

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 141 weight vectors
- Estimated match proportion 0.547

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.146, 0.130, 0.176, 0.318, 0.167] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.821, 1.000, 0.275, 0.297, 0.227, 0.255, 0.152] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.261, 0.174, 0.148, 0.186, 0.148] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.881, 1.000, 0.211, 0.250, 0.129, 0.250, 0.211] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.902, 1.000, 0.182, 0.071, 0.182, 0.222, 0.190] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.850, 1.000, 0.179, 0.205, 0.188, 0.061, 0.180] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.750, 1.000, 0.243, 0.243, 0.214, 0.111, 0.132] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.929, 1.000, 0.250, 0.193, 0.250, 0.164, 0.213] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.592, 1.000, 0.179, 0.205, 0.156, 0.273, 0.180] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.663, 1.000, 0.273, 0.244, 0.226, 0.196, 0.238] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.663, 1.000, 0.132, 0.143, 0.241, 0.174, 0.167] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.747, 1.000, 0.231, 0.167, 0.107, 0.222, 0.125] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 6 matches and 51 non-matches
    Purity of oracle classification:  0.895
    Entropy of oracle classification: 0.485
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing file: diverg(15)865_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 865), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)865_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 806
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 806 weight vectors
  Containing 221 true matches and 585 true non-matches
    (27.42% true matches)
  Identified 750 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   714  (95.20%)
          2 :    33  (4.40%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 750 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 564

Removed 1 non-pure weight vector

Final number of weight vectors to use: 805
  Number of unique weight vectors: 750

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (750, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 750 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 750 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
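The "farthest first" selection logged above can be sketched as a greedy farthest-first traversal. The function below is a hypothetical re-implementation (the script's actual start vector and distance metric are not visible in this log), assuming Euclidean distance and an arbitrary first vector:

```python
def farthest_first(vectors, k):
    """Greedily pick k vectors: start from the first one, then repeatedly
    take the vector whose minimum distance to the already selected set is
    largest, spreading the sample over the weight-vector space."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # A candidate's score is its distance to the nearest selected vector
        far = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        remaining.remove(far)
        selected.append(far)
    return selected
```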

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
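An oracle with a configurable accuracy, as logged above (here 100.00%, so no labels are flipped), can be simulated roughly as follows; the name `noisy_oracle` and the fixed seed are illustrative assumptions, not the script's actual implementation:

```python
import random

def noisy_oracle(true_labels, accuracy, seed=42):
    """Simulate a human oracle: return each true match label unchanged
    with probability `accuracy`, otherwise flip it."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```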

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 665 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 154 matches and 511 non-matches
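The SVM split step (train on the oracle-labelled sample, then divide the rest of the cluster by predicted class) might look like the sketch below; scikit-learn's `SVC` with a linear kernel is an assumption, since the log does not show which SVM library or parameters the script actually uses:

```python
from sklearn.svm import SVC

def svm_split(train_matches, train_non_matches, unlabelled):
    """Train an SVM on oracle-classified weight vectors, then split the
    remaining unlabelled vectors into predicted matches / non-matches."""
    X = train_matches + train_non_matches
    y = [1] * len(train_matches) + [0] * len(train_non_matches)
    clf = SVC(kernel="linear").fit(X, y)
    preds = clf.predict(unlabelled)
    matches = [v for v, p in zip(unlabelled, preds) if p == 1]
    non_matches = [v for v, p in zip(unlabelled, preds) if p == 0]
    return matches, non_matches
```

Each resulting sub-cluster is then pushed back onto the queue for further sampling, as the Loop 2 output below shows.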

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (154, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (511, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
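The purity and entropy attached to each queued cluster are consistent with the binary purity and Shannon entropy of the oracle's match proportion (here 29/85 ≈ 0.341, giving purity ≈ 0.659 and entropy ≈ 0.926); a minimal sketch:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity and binary Shannon entropy (in bits) of a cluster,
    given its match / non-match counts."""
    total = num_matches + num_non_matches
    p = num_matches / total        # estimated match proportion
    purity = max(p, 1.0 - p)       # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy, p
```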

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 511 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 511 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.462, 0.609, 0.684, 0.308, 0.545] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.791, 1.000, 0.275, 0.269, 0.192, 0.084, 0.200] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 4 matches and 70 non-matches
    Purity of oracle classification:  0.946
    Entropy of oracle classification: 0.303
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(10)562_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 562), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)562_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 233
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 233 weight vectors
  Containing 204 true matches and 29 true non-matches
    (87.55% true matches)
  Identified 203 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   189  (93.10%)
          2 :    11  (5.42%)
          3 :     2  (0.99%)
         16 :     1  (0.49%)
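The unique-vector count and the frequency distribution above can be reproduced with two nested `Counter`s; a sketch (the function name is illustrative):

```python
from collections import Counter

def frequency_distribution(weight_vectors):
    """Count exact duplicates among weight vectors, then tally how many
    unique vectors occur once, twice, and so on."""
    counts = Counter(tuple(v) for v in weight_vectors)
    return counts, Counter(counts.values())
```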

Identified 1 non-pure unique weight vector (from 203 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 174
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 28
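A "non-pure" weight vector is an identical vector that occurs with both a match and a non-match label (e.g. the one above with pureness 0.938, i.e. 15 of 16 copies are matches). Removing the minority-label copies can be sketched as below; this is an illustrative guess at the step, not the script's actual code:

```python
from collections import defaultdict

def purify(labelled_vectors):
    """Keep only majority-label copies of each unique weight vector,
    so that every remaining unique vector is pure."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, is_match in labelled_vectors:
        labels = groups[tuple(vec)]
        majority = 2 * sum(labels) >= len(labels)  # ties kept as matches
        if is_match == majority:
            kept.append((vec, is_match))
    return kept
```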

Removed 1 non-pure weight vector

Final number of weight vectors to use: 232
  Number of unique weight vectors: 203

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (203, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 203 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 65

Perform initial selection using "far" method

Farthest first selection of 65 weight vectors from 203 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 65 weight vectors
  The oracle will correctly classify 65 weight vectors and wrongly classify 0
  Classified 43 matches and 22 non-matches
    Purity of oracle classification:  0.662
    Entropy of oracle classification: 0.923
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  22
    Number of false non-matches: 0

Deleted 65 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 138 weight vectors
  Based on 43 matches and 22 non-matches
  Classified 138 matches and 0 non-matches

43.0
Analysing file: diverg(15)130_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 130), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)130_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 955
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 955 weight vectors
  Containing 216 true matches and 739 true non-matches
    (22.62% true matches)
  Identified 900 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   864  (96.00%)
          2 :    33  (3.67%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 900 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 718

Removed 1 non-pure weight vector

Final number of weight vectors to use: 954
  Number of unique weight vectors: 900

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (900, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 900 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 900 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 814 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 112 matches and 702 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (702, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 702 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 702 vectors
  The selected farthest weight vectors are:
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 16 matches and 53 non-matches
    Purity of oracle classification:  0.768
    Entropy of oracle classification: 0.781
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)128_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 128), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)128_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector
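
The non-pure handling above (identical weight vectors that carry both match and non-match labels keep only their majority-class copies) can be sketched as follows; function and variable names are illustrative, not taken from the original program:

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    """Group identical weight vectors; for a non-pure group (mixed
    match/non-match labels) keep only the majority-class copies."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)

    kept_vecs, kept_labels = [], []
    for vec, labs in groups.items():
        num_match = sum(labs)
        majority = 1 if num_match * 2 >= len(labs) else 0
        keep = sum(1 for l in labs if l == majority)
        kept_vecs.extend([list(vec)] * keep)
        kept_labels.extend([majority] * keep)
    return kept_vecs, kept_labels
```

For the log above, a group of 20 identical vectors with 19 matches and 1 non-match (pureness 0.950) would lose exactly its one minority-class copy.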

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.429, 0.786, 0.750, 0.389, 0.857] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
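
The "farthest first" sampling shown above picks, at each step, the weight vector whose minimum Euclidean distance to everything already selected is largest (a greedy k-center traversal). A minimal sketch, assuming a random starting vector; names and the seeding are illustrative:

```python
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first traversal over a list of numeric tuples:
    start from a random vector, then repeatedly add the vector that
    maximises the distance to its nearest already-selected vector."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    rng = random.Random(seed)
    selected = [rng.choice(vectors)]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        min_dist = [min(d, dist(v, vectors[i]))
                    for v, d in zip(vectors, min_dist)]
    return selected
```

Each iteration costs one pass over the cluster, so selecting a sample of size s from n vectors is O(s * n) distance computations.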

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 147 matches and 537 non-matches
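
The splitting step trains an SVM on the oracle-labelled sample and uses it to divide the remaining cluster into a predicted-match and a predicted-non-match child. A sketch using scikit-learn's SVC; the original program may use a different SVM implementation, and the function name is illustrative:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train a linear SVM on the oracle-labelled weight vectors
    (label 1 = match, 0 = non-match), then split the unlabelled
    remainder of the cluster by its predictions."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, pred) if p == 0]
    return matches, non_matches
```

The two resulting lists would then be pushed back onto the cluster queue, as the Loop 2 output below shows (queue length grows from 1 to 2).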

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (537, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 537 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 537 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.556, 0.429, 0.500, 0.700, 0.643] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 7 matches and 68 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.447
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)69_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 69), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)69_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1099
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1099 weight vectors
  Containing 227 true matches and 872 true non-matches
    (20.66% true matches)
  Identified 1042 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1005  (96.45%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1042 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 851

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1098
  Number of unique weight vectors: 1042

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1042, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1042 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1042 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 954 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 845 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (845, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)541_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 541), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)541_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 792
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 792 weight vectors
  Containing 222 true matches and 570 true non-matches
    (28.03% true matches)
  Identified 738 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   701  (94.99%)
          2 :    34  (4.61%)
          3 :     2  (0.27%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 738 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 549

Removed 1 non-pure weight vector

Final number of weight vectors to use: 791
  Number of unique weight vectors: 738

Time to load and analyse the weight vector file: 0.04 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (738, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 738 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 738 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
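The purity, entropy, and estimated match proportion reported for a classified sample follow the standard two-class definitions: the match proportion is the fraction of matches, purity is the majority-class fraction, and entropy is the binary entropy of the label distribution. A minimal sketch (the function name is illustrative, not from the program):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity, binary entropy, and match proportion of a labelled sample."""
    total = num_matches + num_non_matches
    p = num_matches / total                  # estimated match proportion
    purity = max(p, 1.0 - p)                 # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)      # binary entropy term
    return purity, entropy, p

# 29 matches and 56 non-matches, as in the oracle output above:
purity, entropy, prop = cluster_stats(29, 56)
print(round(purity, 3), round(entropy, 3), round(prop, 3))  # 0.659 0.926 0.341
```

These values match the figures the log prints for this sample and reappear unchanged in the Loop 2 queue entries below.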

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 653 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 156 matches and 497 non-matches
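The SVM step trains a classifier on the oracle-labelled vectors and uses it to split the remaining, unlabelled cluster into a predicted-match and a predicted-non-match cluster. A minimal scikit-learn sketch with synthetic stand-in data (the kernel and parameters used by the actual program are not shown in this log and are assumptions here):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Stand-in data: an oracle-labelled sample and an unlabelled remainder.
# Real weight vectors hold 7 similarity values in [0, 1].
X_train = np.vstack([rng.uniform(0.7, 1.0, (10, 7)),   # match-like vectors
                     rng.uniform(0.0, 0.4, (10, 7))])  # non-match-like vectors
y_train = np.array([1] * 10 + [0] * 10)
X_rest = np.vstack([rng.uniform(0.7, 1.0, (5, 7)),
                    rng.uniform(0.0, 0.4, (5, 7))])

clf = SVC(kernel="linear")       # kernel choice is an assumption
clf.fit(X_train, y_train)
pred = clf.predict(X_rest)       # split the remainder into two new clusters
matches = X_rest[pred == 1]
non_matches = X_rest[pred == 0]
print(len(matches), len(non_matches))
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.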

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (497, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 497 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 497 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.767, 0.667, 0.545, 0.786, 0.773] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
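Farthest-first selection, as used above, greedily builds the sample: each new pick is the vector whose minimum distance to the already-selected set is largest, so the sample spreads out over the cluster. A minimal sketch (Euclidean distance and the starting-point choice are assumptions; the program's exact variant may differ):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum distance to the selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        # For each candidate, its distance to the nearest selected vector.
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

vecs = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
print(farthest_first(vecs, 2))  # [[0.0, 0.0], [1.0, 1.0]]
```

This explains why the selected vectors listed above include many near-extreme combinations (all-1.0 and all-0.0 patterns) rather than a uniform random draw.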

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 5 matches and 68 non-matches
    Purity of oracle classification:  0.932
    Entropy of oracle classification: 0.360
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)904_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 904), dtype: object
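The precision, recall and f-measure rows in the summary above follow directly from the tp/fp/fn counts using the standard definitions; a quick check:

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Counts from the diverg(20)904 summary above: tp=39, fp=0, fn=260
p, r, f = prf(39, 0, 260)
print(round(p, 6), round(r, 6), round(f, 6))  # 1.0 0.130435 0.230769
```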

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)904_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1019 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vectors
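A "non-pure" unique weight vector is one that occurs with both match and non-match labels; its pureness is the fraction of its occurrences that are matches, and the minority-class copies are removed. A minimal sketch with hypothetical data (the majority rule and tie-breaking are assumptions inferred from the log):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """Drop minority-class copies of weight vectors that appear with
    both labels. Each item is (vector_tuple, is_match)."""
    labels = defaultdict(list)
    for vec, is_match in weight_vectors:
        labels[vec].append(is_match)

    kept = []
    for vec, is_match in weight_vectors:
        occ = labels[vec]
        pureness = sum(occ) / len(occ)      # fraction of match labels
        majority = pureness >= 0.5          # tie-breaking is an assumption
        if is_match == majority:
            kept.append((vec, is_match))
    return kept

# One vector with pureness 0.95 (19 match copies, 1 non-match copy),
# plus a pure non-match vector; the single minority copy is removed.
data = [((1.0, 0.9), True)] * 19 + [((1.0, 0.9), False)] + [((0.1, 0.2), False)]
print(len(remove_non_pure(data)))  # 20
```

This mirrors the log above, where the one vector with pureness 0.950 loses its single minority-class copy, reducing 1076 vectors to 1075.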

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-matches
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)677_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                 0.976
recall                 0.408027
f-measure              0.575472
da                          125
dm                            0
ndm                           0
tp                          122
fp                            3
tn                  4.76529e+07
fn                          177
Name: (10, 1 - acm diverg, 677), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)677_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 669
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 669 weight vectors
  Containing 142 true matches and 527 true non-matches
    (21.23% true matches)
  Identified 653 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   642  (98.32%)
          2 :     8  (1.23%)
          3 :     2  (0.31%)
          5 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 653 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 128
     0.000 : 525

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 669
  Number of unique weight vectors: 653

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (653, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 653 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 653 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 30 matches and 53 non-matches
    Purity of oracle classification:  0.639
    Entropy of oracle classification: 0.944
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 570 weight vectors
  Based on 30 matches and 53 non-matches
  Classified 84 matches and 486 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (84, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)
    (486, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)

Current size of match and non-match training data sets: 30 / 53

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 84 weight vectors
- Estimated match proportion 0.361

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 84 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
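
The purity and entropy figures reported for each oracle classification follow directly from the match / non-match counts. A minimal sketch of the two measures (function names are illustrative, not taken from the original script):

```python
import math

def cluster_purity(num_match, num_non_match):
    # Purity: fraction of the cluster belonging to the majority class.
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def cluster_entropy(num_match, num_non_match):
    # Shannon entropy (base 2) of the match / non-match split:
    # 0.0 for a pure cluster, 1.0 for a 50/50 split.
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# The oracle block above classified 42 matches and 1 non-match:
print(round(cluster_purity(42, 1), 3))   # 0.977
print(round(cluster_entropy(42, 1), 3))  # 0.159
```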

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

125.0
Analysing the file: diverg(20)959_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 959), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)959_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec
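
The "non-pure" step above groups identical weight vectors and drops the minority-class copies of any vector that was produced by both matching and non-matching record pairs (here, the single vector with pureness 0.950). A sketch of that filter, assuming vectors arrive as (tuple, is_match) pairs (names are illustrative, not from the original script):

```python
from collections import defaultdict

def remove_non_pure(weight_vec_list):
    """Keep only the majority-class copies of each unique weight vector;
       return the filtered list and the number of vectors removed."""
    counts = defaultdict(lambda: [0, 0])  # vec -> [non-match, match] counts
    for vec, is_match in weight_vec_list:
        counts[vec][int(is_match)] += 1
    kept, removed = [], 0
    for vec, is_match in weight_vec_list:
        num_non_match, num_match = counts[vec]
        majority_is_match = num_match >= num_non_match
        if is_match == majority_is_match:
            kept.append((vec, is_match))
        else:
            removed += 1
    return kept, removed
```

For a unique vector occurring 20 times with 19 matches (pureness 0.950), the single non-match copy is removed, mirroring the reduction from 1094 to 1093 weight vectors above.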

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
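
Farthest-first selection starts from one vector and then repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the whole cluster. A minimal Euclidean sketch (the original script's distance function and choice of start vector are not shown here, so both are assumptions):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: begin with the first vector,
    # then repeatedly pick the vector that maximises the minimum
    # Euclidean distance to everything selected so far.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    min_dist = [dist(v, vectors[0]) for v in vectors]
    for _ in range(k - 1):
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[idx]))
    return selected
```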

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches
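
When a cluster must be split, the sample the oracle just labelled serves as training data for a classifier that divides the remaining weight vectors into a candidate-match and a candidate-non-match cluster, as in the "SVM classification of 950 weight vectors" step above. A sketch using scikit-learn's SVC (an assumption; the original script may use a different SVM implementation or kernel):

```python
from sklearn import svm

def split_cluster(train_vecs, train_labels, remaining_vecs):
    # Train an SVM on the oracle-classified sample, then split the
    # unlabelled remainder of the cluster by its predictions.
    clf = svm.SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)  # labels: 1 = match, 0 = non-match
    predictions = clf.predict(remaining_vecs)
    match_cluster = [v for v, p in zip(remaining_vecs, predictions) if p]
    non_match_cluster = [v for v, p in zip(remaining_vecs, predictions) if not p]
    return match_cluster, non_match_cluster
```

Both resulting clusters are pushed back onto the queue carrying the purity, entropy, and match-proportion estimates from the oracle-labelled sample, which is why the two queue entries in Loop 2 share identical statistics.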

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(20)110_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (20, 1 - acm diverg, 110), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)110_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 963
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 963 weight vectors
  Containing 212 true matches and 751 true non-matches
    (22.01% true matches)
  Identified 910 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   875  (96.15%)
          2 :    32  (3.52%)
          3 :     2  (0.22%)
         18 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 910 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 730

Removed 1 non-pure weight vector

Final number of weight vectors to use: 962
  Number of unique weight vectors: 910

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (910, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 910 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 910 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 823 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 117 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (117, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 117 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 117 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing the file: diverg(20)798_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 798), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)798_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
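
The "far" method logged above is a farthest-first traversal: starting from a seed vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest. A minimal sketch — Euclidean distance and first-vector seeding are assumptions here, not necessarily the exact choices of the original code:

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: at each step add the vector whose
    # minimum Euclidean distance to the already-selected set is largest.
    selected = [vectors[0]]  # seed with the first vector (assumption)
    while len(selected) < k:
        best, best_dist = None, -1.0
        for v in vectors:
            if v in selected:
                continue
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected
```

This greedy scheme tends to pick vectors spread over the corners of the weight-vector space, which is why the selected sample above mixes clear matches, clear non-matches, and borderline cases.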

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
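
The purity and entropy reported above follow directly from the 23/65 split: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (assuming these standard definitions, which reproduce the logged 0.739 and 0.829):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity: fraction of the majority class.
    # Entropy: binary Shannon entropy (bits) of the match proportion.
    n = num_matches + num_non_matches
    p = num_matches / n
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy
```

For 23 matches and 65 non-matches this gives purity 65/88 ≈ 0.739 and entropy ≈ 0.829, matching the log.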

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 950 non-matches
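
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by predicted class. A sketch assuming scikit-learn's `SVC` (the SVM settings of the original code are not visible in this log, so kernel choice here is an assumption); note that one of the two sub-clusters can be empty, as in the run above where all 950 vectors were classified as non-matches:

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, cluster_vectors):
    # Train an SVM on the oracle-labelled sample, then split the remaining
    # cluster into predicted-match and predicted-non-match sub-clusters.
    clf = SVC(kernel='linear')
    clf.fit(train_vectors, train_labels)
    pred = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, pred) if p == 0]
    return matches, non_matches
```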

46.0
Analysing file: diverg(20)676_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 676), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)676_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority-class copies of this weight vector will be removed)
     0.000 : 853

Removed 1 non-pure weight vector
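
The removal step above drops minority-class copies of non-pure unique weight vectors (here, the single minority copy of the vector with pureness 0.950). A sketch, assuming pureness is the fraction of a unique vector's occurrences that are true matches:

```python
from collections import Counter

def remove_minority_class_copies(weight_vectors, labels):
    # Count total and match occurrences per unique weight vector.
    counts, match_counts = Counter(), Counter()
    for wv, is_match in zip(weight_vectors, labels):
        counts[wv] += 1
        match_counts[wv] += int(is_match)
    kept = []
    for wv, is_match in zip(weight_vectors, labels):
        pureness = match_counts[wv] / counts[wv]
        # Keep only copies that agree with the vector's majority class
        # (ties at 0.5 are resolved towards "match" here; an assumption).
        majority_is_match = pureness >= 0.5
        if counts[wv] == 1 or is_match == majority_is_match:
            kept.append((wv, is_match))
    return kept
```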

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)908_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 908), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)908_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 775
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 775 weight vectors
  Containing 197 true matches and 578 true non-matches
    (25.42% true matches)
  Identified 733 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   698  (95.23%)
          2 :    32  (4.37%)
          3 :     2  (0.27%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 733 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.000 : 558

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 775
  Number of unique weight vectors: 733

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (733, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 733 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 733 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 648 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 143 matches and 505 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (505, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 505 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 505 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 3 matches and 68 non-matches
    Purity of oracle classification:  0.958
    Entropy of oracle classification: 0.253
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analyzing file: diverg(20)413_NEW.csv
<class 'pandas.core.series.Series'>
Current line here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 413), dtype: object
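
The precision, recall and f-measure in the summary row above follow directly from the tp/fp/fn counts; a minimal check using the standard definitions (nothing specific to this script):

```python
tp, fp, fn = 39, 0, 260  # counts from the summary row above

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_measure = 2 * precision * recall / (precision + recall)

print(precision, round(recall, 6), round(f_measure, 6))
# 1.0 0.130435 0.230769
```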

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)413_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1075
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1075 weight vectors
  Containing 227 true matches and 848 true non-matches
    (21.12% true matches)
  Identified 1018 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   981  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
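
An occurrence histogram like the one above can be obtained by counting duplicate weight vectors and then counting how often each multiplicity occurs; a sketch with toy data (the real script reads the vectors from the CSV file):

```python
from collections import Counter

# Toy weight vectors standing in for the rows of the weight vector file
weight_vectors = [
    (0.5, 1.0), (0.5, 1.0),   # occurs twice
    (0.2, 0.3),               # occurs once
    (0.9, 0.9),               # occurs once
]

vec_counts = Counter(weight_vectors)       # vector -> occurrence count
freq_dist = Counter(vec_counts.values())   # occurrence count -> number of vectors

print(len(vec_counts))  # number of unique weight vectors: 3
for occurrence, num in sorted(freq_dist.items()):
    print(occurrence, ':', num)
```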

Identified 1 non-pure unique weight vector (out of 1018 unique weight vectors)
Pureness (as the fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1074
  Number of unique weight vectors: 1018

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1018, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1018 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1018 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
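
Farthest-first traversal, as used for the selection above, repeatedly picks the vector whose minimum distance to the already-selected set is largest; a minimal Euclidean-distance sketch (an illustration of the technique, not the original implementation, which also chooses its starting vector differently):

```python
import math

def farthest_first(vectors, k):
    # Start from the first vector, then greedily add the vector that
    # maximises the minimum distance to all vectors selected so far.
    selected = [vectors[0]]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected

vecs = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(vecs, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```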

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 931 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 819 non-matches
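
The split step above trains a classifier on the oracle-labelled vectors and uses it to divide the remaining cluster into predicted matches and non-matches; a minimal sketch with scikit-learn's SVC on synthetic stand-in data (the script's actual SVM parameters are not shown in this log, so defaults are assumed):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy stand-ins for the 23 + 64 oracle-labelled training vectors
# (1 = match, 0 = non-match); real vectors come from the cluster
train_X = np.vstack([rng.uniform(0.6, 1.0, size=(23, 7)),
                     rng.uniform(0.0, 0.4, size=(64, 7))])
train_y = np.array([1] * 23 + [0] * 64)

# The remaining, unclassified weight vectors of the cluster
rest_X = rng.uniform(0.0, 1.0, size=(931, 7))

clf = SVC()  # default RBF kernel
clf.fit(train_X, train_y)
pred = clf.predict(rest_X)

# The cluster is split into predicted matches and predicted non-matches
print(int(pred.sum()), 'matches and', int((pred == 0).sum()), 'non-matches')
```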

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (819, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 819 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 819 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)220_NEW.csv
<class 'pandas.core.series.Series'>
Current line here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 220), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)220_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 806
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 806 weight vectors
  Containing 226 true matches and 580 true non-matches
    (28.04% true matches)
  Identified 767 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   748  (97.52%)
          2 :    16  (2.09%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (out of 767 unique weight vectors)
Pureness (as the fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 577

Removed 1 non-pure weight vector

Final number of weight vectors to use: 805
  Number of unique weight vectors: 767

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (767, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 767 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 767 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 682 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 141 matches and 541 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (541, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 541 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 541 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 12 matches and 61 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.645
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)540_NEW.csv
<class 'pandas.core.series.Series'>
Current line here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976562
recall                  0.41806
f-measure               0.58548
da                          128
dm                            0
ndm                           0
tp                          125
fp                            3
tn                  4.76529e+07
fn                          174
Name: (10, 1 - acm diverg, 540), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)540_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 560
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 560 weight vectors
  Containing 133 true matches and 427 true non-matches
    (23.75% true matches)
  Identified 529 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   501  (94.71%)
          2 :    25  (4.73%)
          3 :     3  (0.57%)

Identified 0 non-pure unique weight vectors (from 529 unique weight vectors)
Pureness (as the fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 122
     0.000 : 407

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 560
  Number of unique weight vectors: 529

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (529, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 529 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 529 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
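
The "far" initial selection above is a farthest-first traversal over the cluster's weight vectors: start from one vector, then repeatedly add the vector whose distance to the closest already-selected vector is largest. A minimal sketch (function name, seed, and Euclidean distance are illustrative assumptions; the original script's implementation is not shown in this output):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal (a sketch of the 'far' method)."""
    vectors = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(vectors)))]  # random starting vector
    # min_dist[i] = distance from vector i to its nearest selected vector
    min_dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))  # vector farthest from selected set
        selected.append(nxt)
        min_dist = np.minimum(
            min_dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```

This greedy scheme tends to pick vectors near the extremes of the cluster, which is why both clear matches and clear non-matches appear in the sample.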

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 25 matches and 56 non-matches
    Purity of oracle classification:  0.691
    Entropy of oracle classification: 0.892
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

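
The purity and entropy figures reported for an oracle-classified sample follow the standard two-class definitions: with match proportion p, purity is max(p, 1 - p) and entropy is the binary Shannon entropy. A small sketch reproducing the numbers above (the helper name is illustrative):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity and binary entropy of a two-class sample."""
    total = num_matches + num_non_matches
    p = num_matches / total           # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# Values from the oracle classification above: 25 matches, 56 non-matches
purity, entropy = cluster_stats(25, 56)  # 0.691..., 0.892...
```

With 25 matches out of 81, p = 0.3086, giving the purity 0.691 and entropy 0.892 printed above.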

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 448 weight vectors
  Based on 25 matches and 56 non-matches
  Classified 79 matches and 369 non-matches
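
The split step trains an SVM on the oracle-labelled vectors and uses it to partition the remaining unlabelled vectors of the cluster into a predicted-match and a predicted-non-match sub-cluster (the two queue entries seen in the next loop). A hedged sketch using scikit-learn's SVC on toy data; the kernel and parameters of the original classifier are assumptions, as are the variable names:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical training data standing in for the oracle-labelled vectors
# (25 matches / 56 non-matches in the run above).
train_X = np.array([[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]])
train_y = np.array([1, 1, 0, 0])          # 1 = match, 0 = non-match

clf = SVC(kernel="linear")                # kernel choice is an assumption
clf.fit(train_X, train_y)

# Classify the remaining (unlabelled) weight vectors of the cluster,
# splitting it into two sub-clusters that go back onto the queue.
remaining = np.array([[0.85, 0.95], [0.15, 0.15]])
pred = clf.predict(remaining)
```

Both sub-clusters inherit the parent's purity/entropy estimates until they are sampled themselves, which matches the identical statistics shown for the two queue entries below.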

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (79, 0.691358024691358, 0.8915996278279094, 0.30864197530864196)
    (369, 0.691358024691358, 0.8915996278279094, 0.30864197530864196)

Current size of match and non-match training data sets: 25 / 56

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 79 weight vectors
- Estimated match proportion 0.309

Sample size for this cluster: 41

Farthest first selection of 41 weight vectors from 79 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 41 weight vectors
  The oracle will correctly classify 41 weight vectors and wrongly classify 0
  Classified 41 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 41 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

128.0
Analyzing file: diverg(20)227_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 227), dtype: object
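
The precision, recall, and f-measure fields in this pandas Series follow directly from the confusion counts it carries (tp = 45, fp = 1, fn = 254). A small sketch (the helper name `prf` is illustrative):

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Counts from the Series above: tp = 45, fp = 1, fn = 254
p, r, f = prf(45, 1, 254)  # 0.978261, 0.150502, 0.26087
```
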

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)227_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1077
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1077 weight vectors
  Containing 221 true matches and 856 true non-matches
    (20.52% true matches)
  Identified 1021 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   985  (96.47%)
          2 :    33  (3.23%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
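
A frequency distribution like the one above can be built with two `Counter` passes over the weight vectors, treating each vector as a hashable tuple. A sketch over a toy list (variable names are illustrative):

```python
from collections import Counter

# Toy list standing in for the 1,077 weight vectors; each vector is a
# tuple so it can serve as a dictionary key.
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
           (0.9, 0.9), (0.2, 0.3), (0.4, 0.1)]

vec_counts = Counter(vectors)             # unique vector -> occurrences
freq_dist = Counter(vec_counts.values())  # occurrences -> number of unique vectors
```

Here `len(vec_counts)` plays the role of the "unique weight vectors" count and `freq_dist` the occurrence table.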

Identified 1 non-pure unique weight vectors (from 1021 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 835

Removed 1 non-pure weight vectors
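
The removal step drops, for every non-pure unique weight vector (0 < pureness < 1), the copies whose true-match label is in the minority — here the single non-match among the 20 copies of the vector with pureness 0.950. A sketch under that reading; the function name is illustrative and tie handling in the original script may differ:

```python
from collections import defaultdict

def remove_minority(pairs):
    """pairs: list of (weight_vector_tuple, true_match). For each
    non-pure unique vector keep only the majority-class copies;
    pure vectors (pureness 0.0 or 1.0) are kept untouched."""
    by_vec = defaultdict(list)
    for vec, is_match in pairs:
        by_vec[vec].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        pureness = sum(labels) / len(labels)   # fraction of matches
        majority_is_match = pureness >= 0.5    # tie rule is an assumption
        for is_match in labels:
            if pureness in (0.0, 1.0) or is_match == majority_is_match:
                kept.append((vec, is_match))
    return kept
```

This reproduces the bookkeeping above: one minority-class copy removed, 1077 vectors reduced to 1076.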

Final number of weight vectors to use: 1076
  Number of unique weight vectors: 1021

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1021, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1021 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1021 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 934 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 155 matches and 779 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (779, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 155 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 155 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 47 matches and 8 non-matches
    Purity of oracle classification:  0.855
    Entropy of oracle classification: 0.598
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing file: diverg(20)171_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (20, 1 - acm diverg, 171), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)171_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1026
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1026 weight vectors
  Containing 198 true matches and 828 true non-matches
    (19.30% true matches)
  Identified 984 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   949  (96.44%)
          2 :    32  (3.25%)
          3 :     2  (0.20%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 984 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 808

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 984

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (984, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 984 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 984 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 897 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 93 matches and 804 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (93, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (804, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 93 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 93 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
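The farthest-first selection reported above can be sketched as a greedy traversal that repeatedly picks the weight vector whose minimum Euclidean distance to the already-selected set is largest. A minimal sketch; the seeding rule and the distance metric of the actual program are assumptions:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum Euclidean distance to the already-selected set is largest.
    Assumes the first vector seeds the selection."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    while len(selected) < k:
        # For each remaining candidate, use its distance to the nearest
        # selected vector; pick the candidate maximising that distance.
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```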

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-matches
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
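The purity and entropy figures printed by the oracle step are consistent with purity being the majority-class fraction and entropy the binary (base-2) entropy of the match/non-match split: 42 matches and 1 non-match give purity 0.977 and entropy 0.159. A sketch of that computation:

```python
import math

def purity_entropy(labels):
    """Purity = fraction of the majority class; entropy = binary entropy
    (in bits) of the match/non-match proportions. labels: 1 for a true
    match, 0 for a true non-match."""
    n = len(labels)
    p = sum(labels) / n                 # proportion of matches
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy
```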

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analyzing the file: diverg(10)834_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 834), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)834_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1015
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1015 weight vectors
  Containing 221 true matches and 794 true non-matches
    (21.77% true matches)
  Identified 961 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   924  (96.15%)
          2 :    34  (3.54%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)
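The occurrence distribution above can be computed with two passes of `collections.Counter`: one counting how often each distinct weight vector occurs, and one tallying how many vectors share each occurrence count. A sketch, assuming weight vectors are stored as hashable tuples:

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map each occurrence count to the number of distinct weight
    vectors occurring that often."""
    per_vector = Counter(vectors)        # vector -> occurrence count
    return Counter(per_vector.values())  # occurrence count -> num vectors
```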

Identified 1 non-pure unique weight vector (from 961 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 187
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 773

Removed 1 non-pure weight vector
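The non-pure handling above drops the minority-class copies of any unique weight vector whose record pairs disagree on the true match status (e.g. pureness 0.941 means 16 of 17 copies are matches, so the 1 non-match copy is removed). A sketch under an assumed input format of `(vector, is_match)` tuples and an assumed >= 0.5 majority rule:

```python
from collections import defaultdict

def remove_minority(pairs):
    """For each unique weight vector, compute pureness as the fraction of
    its record pairs that are true matches; for non-pure vectors
    (0 < pureness < 1), keep only the majority-class copies."""
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[vec].append(is_match)
    kept = []
    for vec, flags in groups.items():
        pureness = sum(flags) / len(flags)
        majority = pureness >= 0.5  # assumed tie-breaking rule
        for f in flags:
            if pureness in (0.0, 1.0) or f == majority:
                kept.append((vec, f))
    return kept
```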

Final number of weight vectors to use: 1014
  Number of unique weight vectors: 961

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (961, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 961 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 961 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 874 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 297 matches and 577 non-matches
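The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by its predictions. A sketch using scikit-learn's `SVC`; the kernel and parameters of the actual program are not shown in the log and are assumptions here:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train an SVM on the oracle-classified weight vectors, then split
    the unclassified remainder of the cluster into predicted matches
    (label 1) and predicted non-matches (label 0)."""
    clf = SVC(kernel='linear')  # assumed kernel
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed onto the queue, as the queue lengths in the following loop show.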

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (297, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (577, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 297 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 297 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 47 matches and 21 non-matches
    Purity of oracle classification:  0.691
    Entropy of oracle classification: 0.892
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  21
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing the file: diverg(10)787_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985915
recall                 0.234114
f-measure              0.378378
da                           71
dm                            0
ndm                           0
tp                           70
fp                            1
tn                  4.76529e+07
fn                          229
Name: (10, 1 - acm diverg, 787), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)787_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 188
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 188 weight vectors
  Containing 163 true matches and 25 true non-matches
    (86.70% true matches)
  Identified 170 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   158  (92.94%)
          2 :     9  (5.29%)
          3 :     2  (1.18%)
          6 :     1  (0.59%)

Identified 0 non-pure unique weight vectors (from 170 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 145
     0.000 : 25

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 188
  Number of unique weight vectors: 170

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (170, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 170 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 61

Perform initial selection using "far" method

Farthest first selection of 61 weight vectors from 170 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 61 weight vectors
  The oracle will correctly classify 61 weight vectors and wrongly classify 0
  Classified 42 matches and 19 non-matches
    Purity of oracle classification:  0.689
    Entropy of oracle classification: 0.895
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  19
    Number of false non-matches: 0

Deleted 61 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 109 weight vectors
  Based on 42 matches and 19 non-matches
  Classified 109 matches and 0 non-matches

71.0
Analyzing the file: diverg(10)839_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (10, 1 - acm diverg, 839), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)839_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 673
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 673 weight vectors
  Containing 181 true matches and 492 true non-matches
    (26.89% true matches)
  Identified 652 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   642  (98.47%)
          2 :     7  (1.07%)
          3 :     2  (0.31%)
         11 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 652 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 160
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 491

Removed 1 non-pure weight vector

Final number of weight vectors to use: 672
  Number of unique weight vectors: 652

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (652, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 652 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 652 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 31 matches and 52 non-matches
    Purity of oracle classification:  0.627
    Entropy of oracle classification: 0.953
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 569 weight vectors
  Based on 31 matches and 52 non-matches
  Classified 295 matches and 274 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (295, 0.6265060240963856, 0.9533171305598173, 0.37349397590361444)
    (274, 0.6265060240963856, 0.9533171305598173, 0.37349397590361444)

Current size of match and non-match training data sets: 31 / 52

Selected cluster (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 295 weight vectors
- Estimated match proportion 0.373

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 295 vectors
  The selected farthest weight vectors are:
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.890, 1.000, 0.281, 0.136, 0.183, 0.250, 0.163] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
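The "farthest first" selection reported above can be sketched as a greedy farthest-point traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest. This is a minimal sketch only, assuming Euclidean distance and a random starting vector (both are assumptions; the program's actual metric and seeding are not shown in this log, and the function names are hypothetical):

```python
import random

def euclidean(a, b):
    # Plain Euclidean distance between two weight vectors (an assumption).
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def farthest_first(vectors, k, dist):
    # Greedy farthest-first traversal over distinct vectors: start from a
    # random vector, then repeatedly add the vector whose distance to the
    # closest already-selected vector is largest.
    selected = [random.choice(vectors)]
    while len(selected) < min(k, len(vectors)):
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.9, 0.1)],
                        2, euclidean)
```

Because each new pick maximises the distance to the current sample, the selection tends to cover opposite extremes of the cluster, which matches the mix of clearly matching and clearly non-matching vectors listed above.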

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 41 matches and 28 non-matches
    Purity of oracle classification:  0.594
    Entropy of oracle classification: 0.974
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0
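The purity and entropy figures reported for an oracle-classified sample can be reproduced from the match / non-match counts alone: purity is the majority-class proportion and entropy is the binary (base-2) entropy of the match proportion. A minimal sketch (the helper name `cluster_stats` is hypothetical, not from the program):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    # Purity = proportion of the majority class; entropy = binary entropy
    # of the match proportion (0 for a pure sample, 1 for a 50/50 split).
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log(q, 2) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# The oracle sample above: 41 matches, 28 non-matches
purity, entropy = cluster_stats(41, 28)
print(round(purity, 3), round(entropy, 3))  # 0.594 0.974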

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing the file: diverg(20)90_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 90), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)90_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
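The occurrence distribution above (how often each identical weight vector appears, and how many vectors appear that often) can be computed with two nested counters. A sketch, with the hypothetical helper name `occurrence_distribution`:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First counter: each distinct weight vector -> number of occurrences.
    per_vector = Counter(map(tuple, weight_vectors))
    # Second counter: occurrence count -> number of vectors occurring that often.
    return Counter(per_vector.values())

dist = occurrence_distribution([[0.5, 1.0], [0.5, 1.0], [0.2, 0.3]])
# one vector occurs twice, one occurs once
```

The number of unique weight vectors is then simply the sum of the second counter's values.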

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 141 matches and 543 non-matches
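The split step above trains a classifier on the oracle-labelled sample and partitions the remaining vectors of the cluster into candidate match and non-match sub-clusters. A minimal sketch using scikit-learn's `SVC` (the program's actual kernel and parameters are not shown in this log, so the defaults used here are assumptions, as is the helper name `svm_split`):

```python
from sklearn.svm import SVC

def svm_split(labelled, labels, unlabelled):
    # Train an SVM on the oracle-labelled weight vectors (1 = match,
    # 0 = non-match), then classify the remaining unlabelled vectors
    # into the two sub-clusters processed in later loops.
    clf = SVC()  # default RBF kernel; kernel and C are assumptions
    clf.fit(labelled, labels)
    pred = clf.predict(unlabelled)
    matches = [v for v, p in zip(unlabelled, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled, pred) if p == 0]
    return matches, non_matches

matches, non_matches = svm_split(
    [[0.1, 0.1], [0.2, 0.15], [0.9, 0.9], [0.85, 0.95]],  # labelled sample
    [0, 0, 1, 1],                                         # oracle labels
    [[0.12, 0.1], [0.88, 0.92]],                          # rest of cluster
)
```

Both sub-clusters are then pushed back onto the queue with the purity, entropy, and match-proportion estimates taken from the parent's oracle sample, which is why the two queue entries above share the same statistics.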

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 543 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 543 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 12 matches and 61 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.645
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)106_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 106), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)106_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 701
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 701 weight vectors
  Containing 216 true matches and 485 true non-matches
    (30.81% true matches)
  Identified 646 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   610  (94.43%)
          2 :    33  (5.11%)
          3 :     2  (0.31%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 646 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 181
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 464

Removed 1 non-pure weight vector

Final number of weight vectors to use: 700
  Number of unique weight vectors: 646

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (646, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 646 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 646 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 30 matches and 53 non-matches
    Purity of oracle classification:  0.639
    Entropy of oracle classification: 0.944
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 563 weight vectors
  Based on 30 matches and 53 non-matches
  Classified 202 matches and 361 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (202, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)
    (361, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)

Current size of match and non-match training data sets: 30 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 361 weight vectors
- Estimated match proportion 0.361

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 361 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.700, 0.429, 0.476, 0.647, 0.810] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 0.000, 0.600, 0.857, 0.579, 0.286, 0.545] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(15)35_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 35), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)35_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1045
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1045 weight vectors
  Containing 214 true matches and 831 true non-matches
    (20.48% true matches)
  Identified 991 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   956  (96.47%)
          2 :    32  (3.23%)
          3 :     2  (0.20%)
         19 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 991 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 810

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1044
  Number of unique weight vectors: 991

Time to load and analyse the weight vector file: 0.05 sec
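
The pureness analysis above groups identical weight vectors and, for any vector that occurs with both labels, removes the minority-class copies. A sketch under the assumption that the input is a list of (vector, is_match) pairs (all names illustrative):

```python
from collections import defaultdict

def remove_non_pure(weight_vec_list):
    """Drop minority-class copies of weight vectors that occur with both labels."""
    by_vec = defaultdict(list)
    for vec, is_match in weight_vec_list:
        by_vec[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        pureness = sum(labels) / len(labels)  # fraction of matches for this vector
        majority = pureness >= 0.5            # assumption: ties keep the match label
        for is_match in labels:
            if is_match == majority:
                kept.append((list(vec), is_match))
    return kept
```

In the run above this removes a single non-match copy of a vector that occurred 19 times (pureness 0.947), leaving 1044 of the 1045 input vectors.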

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (991, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 991 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 991 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
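
"Farthest first" selection, used for each sample above, greedily picks the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the weight-vector space. A minimal Euclidean sketch (the seeding of the original run is not visible in the log, so starting from the first vector is an assumption):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k indices, maximising the minimum distance to those chosen."""
    selected = [0]  # assumption: seed with the first vector
    # distance from every vector to its nearest already-selected vector
    min_dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[nxt]))
    return selected
```

Already-selected vectors have minimum distance zero, so they are never picked again while distinct vectors remain.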

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 904 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 166 matches and 738 non-matches
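
After the oracle labels a sample, the run trains an SVM on those labels and splits the remaining 904 vectors into predicted matches and non-matches. The original presumably uses a real SVM library; as a dependency-free stand-in, a nearest-centroid classifier (plainly a substitute, not the actual method) illustrates the same train-then-split step:

```python
def split_cluster(rest, match_train, non_match_train):
    """Split unlabelled vectors by distance to the two class centroids
    (nearest-centroid stand-in for the SVM used in the actual run)."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    c_match = centroid(match_train)
    c_non = centroid(non_match_train)
    matches, non_matches = [], []
    for v in rest:
        if sq_dist(v, c_match) <= sq_dist(v, c_non):
            matches.append(v)
        else:
            non_matches.append(v)
    return matches, non_matches
```

The two resulting sub-clusters are what get pushed onto the queue for the next loop iteration.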

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (166, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (738, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
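
Each split pushes two child clusters onto the queue; until a child is itself sampled it inherits the purity, entropy, and match-proportion estimate of the parent sample, which is why both queue entries above share identical statistics apart from size. The driving loop, as far as it can be inferred from the log, can be sketched as (all names illustrative):

```python
import random

def training_selection_loop(root_cluster, budget, sample_and_split):
    """Pop clusters (queue ordering: random), sample them with the oracle, and
    re-queue any impure or oversized children until the budget is exhausted."""
    queue = [root_cluster]
    used = 0  # number of manual oracle classifications performed
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))
        sample_size, children = sample_and_split(cluster)
        used += sample_size       # oracle calls spent on this cluster's sample
        queue.extend(children)    # children that still need splitting
    return used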

Current size of match and non-match training data sets: 30 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 166 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 166 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 43 matches and 14 non-matches
    Purity of oracle classification:  0.754
    Entropy of oracle classification: 0.804
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  14
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(20)609_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 609), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)609_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)223_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984848
recall                 0.217391
f-measure              0.356164
da                           66
dm                            0
ndm                           0
tp                           65
fp                            1
tn                  4.76529e+07
fn                          234
Name: (10, 1 - acm diverg, 223), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)223_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 501
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 501 weight vectors
  Containing 176 true matches and 325 true non-matches
    (35.13% true matches)
  Identified 476 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   465  (97.69%)
          2 :     8  (1.68%)
          3 :     2  (0.42%)
         14 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 476 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 151
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 324

Removed 1 non-pure weight vector

Final number of weight vectors to use: 500
  Number of unique weight vectors: 476

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (476, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 476 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 476 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 29 matches and 51 non-matches
    Purity of oracle classification:  0.637
    Entropy of oracle classification: 0.945
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0
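
The purity and entropy values printed in the oracle summaries follow the usual cluster-quality definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch that reproduces the figures above (the function names are illustrative, not taken from the program):

```python
import math

def purity(num_matches, num_non_matches):
    # Purity: fraction of weight vectors belonging to the majority class.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Binary Shannon entropy (base 2) of the match / non-match split.
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0  # a pure cluster has zero entropy
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(round(purity(29, 51), 3))   # 0.637, as in the oracle summary above
print(round(entropy(29, 51), 3))  # 0.945
```

With 29 matches and 51 non-matches this gives purity 51/80 = 0.637 and entropy 0.945, matching the log.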

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 396 weight vectors
  Based on 29 matches and 51 non-matches
  Classified 130 matches and 266 non-matches
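
The "SVM classification" step trains a classifier on the oracle-labelled vectors and uses it to split the remaining cluster into a predicted-match and a predicted-non-match sub-cluster (the two queue entries seen in the next loop). The program's actual SVM implementation and parameters are not visible in this log, so the sketch below is an illustrative stand-in: a bare-bones linear SVM trained by hinge-loss subgradient descent.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lr=0.1, lam=0.001, epochs=500):
    # Hinge-loss subgradient descent for a linear SVM (toy stand-in for
    # whatever SVM library the program actually uses). Labels y are +1 / -1.
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # margin violated: update
                w = [(1 - lr * lam) * wj + lr * yi * xj
                     for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def split_cluster(w, b, cluster):
    # Split the remaining unlabelled weight vectors into the two
    # sub-clusters that are pushed back onto the queue.
    matches = [v for v in cluster if dot(w, v) + b >= 0]
    non_matches = [v for v in cluster if dot(w, v) + b < 0]
    return matches, non_matches
```

In the run above, the classifier trained on 29 matches and 51 non-matches splits the 396 remaining vectors into sub-clusters of 130 and 266.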

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.6375, 0.944738828646789, 0.3625)
    (266, 0.6375, 0.944738828646789, 0.3625)

Current size of match and non-match training data sets: 29 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 266 weight vectors
- Estimated match proportion 0.362

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 266 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.864, 0.667, 0.435, 0.700, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.846, 0.857, 0.353, 0.318, 0.400] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.680, 0.000, 0.609, 0.737, 0.600, 0.529, 0.696] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.615, 0.714, 0.353, 0.583, 0.571] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
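
The farthest-first selections above pick sample vectors that are spread out over the cluster. A common way to implement this is the greedy farthest-first traversal: seed with one vector, then repeatedly add the vector whose minimum distance to the current selection is largest. This sketch assumes Euclidean distance and an arbitrary (first) seed; the program may use a different seeding rule or metric.

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over a list of weight vectors.
    # Each step picks the vector that maximises its minimum distance
    # to the vectors selected so far.
    selected = [vectors[0]]
    # min_dist[j]: distance from vectors[j] to the nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        j = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[j])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[j]))
    return selected
```

Each of the k-1 greedy steps scans all vectors once, so selecting k of n vectors costs O(k*n) distance computations.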

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 0 matches and 67 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

66.0
Analysing file: diverg(10)889_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 889), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)889_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 887
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 887 weight vectors
  Containing 202 true matches and 685 true non-matches
    (22.77% true matches)
  Identified 838 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   804  (95.94%)
          2 :    31  (3.70%)
          3 :     2  (0.24%)
         15 :     1  (0.12%)

Identified 1 non-pure unique weight vectors (from 838 unique weight vectors)
Pureness (percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 173
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 664

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 886
  Number of unique weight vectors: 838
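
The load-and-analyse step above boils down to counting duplicate weight vectors and, per unique vector, the fraction of its occurrences that are true matches (its "pureness"); minority-class copies of non-pure vectors are then removed. A small sketch of that bookkeeping (names are illustrative, not from the program):

```python
from collections import Counter

def analyse_weight_vectors(vectors):
    # vectors: list of (weight_tuple, is_true_match) pairs.
    counts = Counter(w for w, _ in vectors)        # occurrences per vector
    matches = Counter(w for w, m in vectors if m)  # match occurrences
    # Occurrence -> number of unique vectors occurring that often
    freq_dist = Counter(counts.values())
    # Pureness: fraction of a vector's occurrences that are true matches
    pureness = {w: matches[w] / c for w, c in counts.items()}
    return freq_dist, pureness
```

For the file above this would report 838 unique vectors, the 1/2/3/15 occurrence distribution, and a single non-pure vector with pureness 0.933.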

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (838, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 838 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 838 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 752 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 170 matches and 582 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (170, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (582, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 582 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 582 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 0 matches and 75 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(10)521_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 521), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)521_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 217
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 217 weight vectors
  Containing 180 true matches and 37 true non-matches
    (82.95% true matches)
  Identified 199 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   187  (93.97%)
          2 :     9  (4.52%)
          3 :     2  (1.01%)
          6 :     1  (0.50%)

Identified 0 non-pure unique weight vectors (from 199 unique weight vectors)
Pureness (percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 162
     0.000 : 37

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 217
  Number of unique weight vectors: 199

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (199, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 199 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 65

Perform initial selection using "far" method

Farthest first selection of 65 weight vectors from 199 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 65 weight vectors
  The oracle will correctly classify 65 weight vectors and wrongly classify 0
  Classified 39 matches and 26 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0

Deleted 65 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 134 weight vectors
  Based on 39 matches and 26 non-matches
  Classified 134 matches and 0 non-matches

69.0
Analysing file: diverg(15)476_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (15, 1 - acm diverg, 476), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)476_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1031
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1031 weight vectors
  Containing 203 true matches and 828 true non-matches
    (19.69% true matches)
  Identified 981 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   947  (96.53%)
          2 :    31  (3.16%)
          3 :     2  (0.20%)
         16 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 981 unique weight vectors)
Pureness (percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 173
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 807

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1030
  Number of unique weight vectors: 981

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (981, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 981 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 981 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
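
The "farthest first" picks listed above can be sketched as a greedy loop: each new sample is the vector whose distance to its closest already-selected vector is largest. This is a minimal sketch assuming Euclidean distance and an arbitrary seed; the script's actual metric and seeding may differ.

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising the distance
    to its nearest already-selected vector."""
    selected = [vectors[0]]  # seed with an arbitrary first vector
    while len(selected) < k:
        # distance from a candidate to its closest selected vector
        def min_dist(v):
            return min(math.dist(v, s) for s in selected)
        selected.append(max(vectors, key=min_dist))
    return selected

# The outlier (10, 0) is picked immediately after the seed:
print(farthest_first([(0, 0), (1, 0), (0.5, 0), (10, 0)], 3))
# [(0, 0), (10, 0), (1, 0)]
```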

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
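
The purity and entropy figures reported for each cluster follow from the match/non-match counts: purity is the fraction of the majority class, and entropy is the two-class Shannon entropy of the match proportion. A minimal sketch reproducing the numbers above (26 matches, 61 non-matches):

```python
import math

def purity_entropy(num_match, num_nonmatch):
    total = num_match + num_nonmatch
    p = num_match / total                 # match proportion
    purity = max(p, 1.0 - p)              # fraction of the majority class
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = purity_entropy(26, 61)
print(round(purity, 3), round(entropy, 3))  # 0.701 0.88
```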

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 894 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 101 matches and 793 non-matches
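
The split step trains a classifier on the oracle-labelled samples and partitions the remaining weight vectors by its predictions. A minimal sketch using scikit-learn's `SVC` (assumed here; the original script may configure its SVM differently):

```python
from sklearn import svm

def split_cluster(train_vectors, train_labels, remaining):
    """Train an SVM on oracle-labelled vectors (True = match) and
    split the remaining vectors into predicted matches / non-matches."""
    clf = svm.SVC(kernel="linear")
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(remaining)
    matches = [v for v, p in zip(remaining, predictions) if p]
    non_matches = [v for v, p in zip(remaining, predictions) if not p]
    return matches, non_matches

m, n = split_cluster([[0.9, 0.9], [0.1, 0.1]], [True, False],
                     [[0.8, 0.8], [0.2, 0.2]])
print(len(m), len(n))  # 1 1
```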

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 793 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing the file: diverg(20)465_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 465), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)465_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 123 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(20)319_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 319), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)319_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1082
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1082 weight vectors
  Containing 226 true matches and 856 true non-matches
    (20.89% true matches)
  Identified 1025 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   988  (96.39%)
          2 :    34  (3.32%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1025 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 835

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1081
  Number of unique weight vectors: 1025

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1025, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1025 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1025 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 29 matches and 58 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
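
The purity and entropy figures reported throughout this log can be reproduced with a short sketch, assuming purity is the majority-class fraction of a cluster and entropy is the binary Shannon entropy of its match proportion (the function name is illustrative, not from the original program):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Majority-class purity and binary Shannon entropy of a cluster."""
    total = num_match + num_non_match
    p = num_match / total          # match proportion of the cluster
    purity = max(p, 1.0 - p)       # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                # 0 * log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The first oracle call above classified 29 matches and 58 non-matches:
print(purity_entropy(29, 58))  # -> purity 0.667, entropy 0.918
```

With 29 of 87 vectors being matches, purity is 58/87 ≈ 0.667 and entropy is ≈ 0.918, matching the values the oracle reports.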

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 938 weight vectors
  Based on 29 matches and 58 non-matches
  Classified 159 matches and 779 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (779, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 29 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 779 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 779 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.667, 0.500, 0.647, 0.556, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.750, 0.429, 0.526, 0.500, 0.846] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.462, 0.889, 0.455, 0.211, 0.375] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.412, 0.318, 0.421] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.233, 0.545, 0.714, 0.455, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
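
The repeated "Farthest first selection" steps can be sketched generically: start from one vector, then greedily add the vector whose minimum distance to the already-selected set is largest. Euclidean distance and a seeded random start are assumptions here; the log does not show the actual distance measure or start rule.

```python
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first traversal: repeatedly pick the vector that
    maximises the minimum distance to those selected so far."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    rnd = random.Random(seed)
    remaining = list(vectors)
    selected = [remaining.pop(rnd.randrange(len(remaining)))]
    while remaining and len(selected) < k:
        idx = max(range(len(remaining)),
                  key=lambda i: min(dist(remaining[i], s) for s in selected))
        selected.append(remaining.pop(idx))
    return selected

# Two near-duplicate vectors and two distant ones: the traversal keeps
# at most one of the near-duplicates, spreading picks across the space.
vecs = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(vecs, 3))
```

This spreading behaviour is why the selected samples above cover both clear matches and clear non-matches rather than one dense region.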

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 2 matches and 75 non-matches
    Purity of oracle classification:  0.974
    Entropy of oracle classification: 0.174
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)36_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976923
recall                 0.424749
f-measure              0.592075
da                          130
dm                            0
ndm                           0
tp                          127
fp                            3
tn                  4.76529e+07
fn                          172
Name: (10, 1 - acm diverg, 36), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)36_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 541
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 541 weight vectors
  Containing 130 true matches and 411 true non-matches
    (24.03% true matches)
  Identified 510 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   482  (94.51%)
          2 :    25  (4.90%)
          3 :     3  (0.59%)

Identified 0 non-pure unique weight vectors (from 510 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 119
     0.000 : 391

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 541
  Number of unique weight vectors: 510

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (510, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 510 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 510 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 26 matches and 55 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.905
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 429 weight vectors
  Based on 26 matches and 55 non-matches
  Classified 116 matches and 313 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (116, 0.6790123456790124, 0.9054522631867894, 0.32098765432098764)
    (313, 0.6790123456790124, 0.9054522631867894, 0.32098765432098764)

Current size of match and non-match training data sets: 26 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 313 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 66

Farthest first selection of 66 weight vectors from 313 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.556, 0.348, 0.467, 0.636, 0.412] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.538, 0.600, 0.471, 0.632, 0.688] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.800, 0.667, 0.381, 0.550, 0.429] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.800, 0.000, 0.444, 0.545, 0.333, 0.111, 0.533] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.462, 0.667, 0.636, 0.368, 0.500] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 0 matches and 66 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

130.0
Analysing file: diverg(15)715_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 715), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)715_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 611
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 611 weight vectors
  Containing 191 true matches and 420 true non-matches
    (31.26% true matches)
  Identified 585 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   570  (97.44%)
          2 :    12  (2.05%)
          3 :     2  (0.34%)
         11 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 585 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 417

Removed 1 non-pure weight vector

Final number of weight vectors to use: 610
  Number of unique weight vectors: 585

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (585, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 585 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 585 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 31 matches and 51 non-matches
    Purity of oracle classification:  0.622
    Entropy of oracle classification: 0.957
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 503 weight vectors
  Based on 31 matches and 51 non-matches
  Classified 143 matches and 360 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)
    (360, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)

Current size of match and non-match training data sets: 31 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 360 weight vectors
- Estimated match proportion 0.378

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 360 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.750, 0.905, 0.667, 0.500, 0.571] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.600, 0.700, 0.600, 0.611, 0.706] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.826, 0.286, 0.857, 0.643] (False)
    [1.000, 0.000, 0.625, 0.526, 0.300, 0.778, 0.609] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 1 matches and 71 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.106
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

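The purity and entropy figures printed after each oracle call follow directly from the match proportion of the classified sample: purity is the fraction of the majority class, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch that reproduces the numbers in this log (function names are illustrative, not taken from the script):

```python
import math

def cluster_purity(num_matches, num_non_matches):
    """Purity: fraction of the majority class in the classified sample."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    """Binary Shannon entropy of the match proportion (0 = pure, 1 = 50/50)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# The 1-match / 71-non-match oracle result above:
print(round(cluster_purity(1, 71), 3), round(cluster_entropy(1, 71), 3))
```

The same two functions also reproduce the 0.646 / 0.937 figures reported later for the 29-match / 53-non-match sample.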
Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(10)455_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 455), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)455_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 581
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 581 weight vectors
  Containing 187 true matches and 394 true non-matches
    (32.19% true matches)
  Identified 559 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   543  (97.14%)
          2 :    13  (2.33%)
          3 :     2  (0.36%)
          6 :     1  (0.18%)

Identified 0 non-pure unique weight vectors (from 559 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.000 : 392

Removed 0 non-pure weight vectors

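A unique weight vector that was generated by both true matching and true non-matching record pairs is "non-pure"; before selection, the minority-class copies of such vectors are removed (here none, so 0 were removed; for a later file in this log one copy is removed from a vector with pureness 0.950). A sketch of that filtering step — the grouping, tie-handling, and function name are assumptions, not taken from the script:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """weight_vectors: list of (vector_tuple, is_match) pairs.
    Group identical vectors, compute pureness = match fraction per group,
    and drop minority-class copies of groups that are not fully pure."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        majority_is_match = pureness >= 0.5   # tie-handling is an assumption
        for is_match in labels:
            if pureness in (0.0, 1.0) or is_match == majority_is_match:
                kept.append((vec, is_match))
    return kept
```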
Final number of weight vectors to use: 581
  Number of unique weight vectors: 559

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (559, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 559 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

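The per-cluster sample sizes in this log (82 of 559, 88 of 1037, 78 of 409, 70 of 336, 68 of 846) are all consistent with Cochran's finite-population sample-size formula using a 95% confidence level (z = 1.96), an error margin of 0.1, and the cluster's estimated match proportion as p. These parameter values are inferred from the printed numbers, not confirmed by the script:

```python
def cluster_sample_size(n_cluster, est_match_prop, z=1.96, err=0.1):
    """Cochran's formula with finite-population correction:
    n0 = z^2 * p * (1 - p) / err^2, then n = n0 / (1 + (n0 - 1) / N)."""
    n0 = (z ** 2) * est_match_prop * (1 - est_match_prop) / (err ** 2)
    return int(round(n0 / (1 + (n0 - 1) / n_cluster)))

print(cluster_sample_size(559, 0.5))   # the 559-vector cluster above
```

In later loops the estimated match proportion drops below 0.5, which shrinks p(1 - p) and hence the sample size (e.g. 70 from 336 vectors at p ≈ 0.354).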
Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 559 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

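Farthest-first selection grows the sample greedily: starting from a seed vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads toward the extremes of the weight-vector space (note the all-1.0 and near-0.0 corner vectors in the list above). A minimal sketch in pure Python — the seed choice and Euclidean metric are assumptions:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: pick k vectors, each maximising
    its minimum Euclidean distance to the vectors selected so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                       # arbitrary seed (assumption)
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        # refresh each vector's distance to its nearest selected vector
        min_dist = [min(d, dist(v, vectors[i]))
                    for v, d in zip(vectors, min_dist)]
    return selected
```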
Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 29 matches and 53 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.937
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 477 weight vectors
  Based on 29 matches and 53 non-matches
  Classified 141 matches and 336 non-matches

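After the oracle labels a sample, the cluster's remaining vectors are split by training an SVM on those labels and partitioning the rest according to its predictions (here: 141 predicted matches and 336 predicted non-matches from a 29/53 training set). A sketch of this step with scikit-learn — the kernel and any preprocessing used by the actual script are unknown, so a linear SVM is assumed:

```python
from sklearn.svm import SVC

def svm_split(labeled_vecs, labels, unlabeled_vecs):
    """Train an SVM on the oracle-labelled sample (labels: 1 = match,
    0 = non-match), then split the remaining weight vectors into
    predicted matches and predicted non-matches."""
    clf = SVC(kernel="linear")       # kernel choice is an assumption
    clf.fit(labeled_vecs, labels)
    preds = clf.predict(unlabeled_vecs)
    matches = [v for v, p in zip(unlabeled_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(unlabeled_vecs, preds) if p == 0]
    return matches, non_matches
```

The two predicted partitions then re-enter the cluster queue, each inheriting the parent sample's purity, entropy, and estimated match proportion, as the Loop 2 output below shows.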
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)
    (336, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)

Current size of match and non-match training data sets: 29 / 53

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 336 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 336 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.833, 0.833, 0.550, 0.500, 0.688] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.474, 0.692, 0.826, 0.484, 0.545] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.700, 0.536, 0.353, 0.647, 0.571] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.500, 0.452, 0.632, 0.714, 0.667] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.826, 0.286, 0.857, 0.643] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 5 matches and 65 non-matches
    Purity of oracle classification:  0.929
    Entropy of oracle classification: 0.371
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing the file: diverg(20)759_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 759), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)759_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 221 true matches and 872 true non-matches
    (20.22% true matches)
  Identified 1037 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1037 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 851

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1037

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1037, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1037 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1037 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 949 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 846 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 846 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 846 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)772_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 772), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)772_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 442
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 442 weight vectors
  Containing 208 true matches and 234 true non-matches
    (47.06% true matches)
  Identified 409 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   395  (96.58%)
          2 :    11  (2.69%)
          3 :     2  (0.49%)
         19 :     1  (0.24%)

Identified 1 non-pure unique weight vector (from 409 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 233

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 441
  Number of unique weight vectors: 409

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (409, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 409 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 409 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
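
The farthest-first selection listed above greedily picks, at each step, the vector whose minimum distance to the already-selected set is largest. A minimal sketch of the traversal (an assumed reimplementation using Euclidean distance; the script's actual distance measure and starting rule may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: select k vectors so that each
    new pick maximises its minimum distance to the prior picks."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                      # arbitrary starting point
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):          # refresh distance to nearest pick
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected

# Toy example in 2-D: the three corner points are picked, the interior one is not
pts = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.1), (0.0, 1.0)]
picked = farthest_first(pts, 3)
```

The greedy rule is why the selected sample spreads across the corners of the weight-vector space rather than clustering around its centre.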

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 36 matches and 42 non-matches
    Purity of oracle classification:  0.538
    Entropy of oracle classification: 0.996
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  42
    Number of false non-matches: 0
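
The purity and entropy values logged for each oracle-classified sample follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. A quick check against the 36/42 split above (formulas inferred from the logged numbers, not taken from the source code):

```python
import math

def purity(num_match, num_non_match):
    """Purity: fraction of the sample belonging to the majority class."""
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    """Binary Shannon entropy (base 2) of the match proportion."""
    total = num_match + num_non_match
    p = num_match / total
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Counts from the oracle classification above: 36 matches, 42 non-matches
pur = purity(36, 42)     # 42/78 ~ 0.538
ent = entropy(36, 42)    # ~ 0.996
```

These reproduce the cluster statistics propagated to the queue in the next loop (0.5384…, 0.9957…).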

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 331 weight vectors
  Based on 36 matches and 42 non-matches
  Classified 133 matches and 198 non-matches
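
The split step trains a classifier on the oracle-labelled sample and partitions the remaining weight vectors according to its predictions. A rough sketch using scikit-learn's `SVC`; the choice of `SVC` with a linear kernel is an assumption, since the log only says "SVM classification":

```python
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining weight vectors into predicted matches / non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(labelled_vecs, labels)
    preds = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, preds) if p == 0]
    return matches, non_matches

# Toy example: high similarities -> match (1), low -> non-match (0)
train = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.0, 0.1]]
y = [1, 1, 0, 0]
m, nm = svm_split(train, y, [[0.95, 0.85], [0.05, 0.15]])
```

The two resulting partitions become the new clusters pushed onto the queue, as the Loop 2 output below shows.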

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.5384615384615384, 0.9957274520849256, 0.46153846153846156)
    (198, 0.5384615384615384, 0.9957274520849256, 0.46153846153846156)

Current size of match and non-match training data sets: 36 / 42

Selected cluster (queue ordering: random) with:
- Purity 0.54 and entropy 1.00
- Size 198 weight vectors
- Estimated match proportion 0.462

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 198 vectors
  The selected farthest weight vectors are:
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.791, 1.000, 0.275, 0.269, 0.192, 0.084, 0.200] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 10 matches and 54 non-matches
    Purity of oracle classification:  0.844
    Entropy of oracle classification: 0.625
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)270_NEW.csv
<class 'pandas.core.series.Series'>
Current line here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 270), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)270_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1050
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1050 weight vectors
  Containing 208 true matches and 842 true non-matches
    (19.81% true matches)
  Identified 1003 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   968  (96.51%)
          2 :    32  (3.19%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1003 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 821

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1049
  Number of unique weight vectors: 1003

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1003, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1003 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1003 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 916 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (793, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 793 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 12 matches and 58 non-matches
    Purity of oracle classification:  0.829
    Entropy of oracle classification: 0.661
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)489_NEW.csv
<class 'pandas.core.series.Series'>
Current line here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 489), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)489_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 836
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 836 weight vectors
  Containing 208 true matches and 628 true non-matches
    (24.88% true matches)
  Identified 789 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   754  (95.56%)
          2 :    32  (4.06%)
          3 :     2  (0.25%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 789 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 607

Removed 1 non-pure weight vector

Final number of weight vectors to use: 835
  Number of unique weight vectors: 789

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (789, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 789 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 789 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 25 matches and 60 non-matches
    Purity of oracle classification:  0.706
    Entropy of oracle classification: 0.874
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
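
The purity and entropy figures reported for each oracle-labelled sample can be reproduced with the following minimal sketch (not the original program's code), assuming purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the match/non-match proportions:

```python
import math

def purity_entropy(num_match, num_non_match):
    # Purity: fraction of the majority class in the labelled sample.
    # Entropy: base-2 Shannon entropy of the match/non-match proportions.
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# The 25 matches / 60 non-matches classified above:
purity, entropy = purity_entropy(25, 60)
# purity = 0.706, entropy = 0.874 (to three decimals), matching the log
```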

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 704 weight vectors
  Based on 25 matches and 60 non-matches
  Classified 123 matches and 581 non-matches
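
The SVM step trains on the oracle-labelled sample (here 25 matches and 60 non-matches) and splits the remaining 704 vectors into two child clusters by predicted class. A minimal sketch, assuming scikit-learn's `SVC` with a linear kernel; the SVM library and parameters of the original program are not shown in this log:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    # Train on the oracle-classified sample (labels: 1 = match, 0 = non-match),
    # then partition the unlabelled cluster by predicted class.
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    match_child = [v for v, p in zip(cluster_vecs, pred) if p]
    non_match_child = [v for v, p in zip(cluster_vecs, pred) if not p]
    return match_child, non_match_child
```

Both children are then put back on the cluster queue, which is why the queue length grows to 2 in the next loop.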

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)
    (581, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)

Current size of match and non-match training data sets: 25 / 60

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 581 weight vectors
- Estimated match proportion 0.294

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 581 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
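
The farthest-first selection above can be sketched as a greedy farthest-point traversal: repeatedly add the vector whose minimum distance to the already-selected set is largest. A sketch assuming Euclidean distance and the first vector as seed (the original distance measure and seeding rule are not shown in this log):

```python
import math

def farthest_first(vectors, k):
    # Seed with the first vector (an assumption), then greedily add the
    # vector whose minimum Euclidean distance to the selected set is largest.
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        farthest = max(remaining,
                       key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(farthest)
        selected.append(farthest)
    return selected
```

On one-dimensional toy data, `farthest_first([(0.0,), (1.0,), (10.0,), (5.0,)], 3)` picks `(0.0,)`, then `(10.0,)`, then `(5.0,)`, spreading the sample across the space as the listings above do.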

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 16 matches and 54 non-matches
    Purity of oracle classification:  0.771
    Entropy of oracle classification: 0.776
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(20)377_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 377), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)377_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 946
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 946 weight vectors
  Containing 219 true matches and 727 true non-matches
    (23.15% true matches)
  Identified 891 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   855  (95.96%)
          2 :    33  (3.70%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 891 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 706

Removed 1 non-pure weight vector

Final number of weight vectors to use: 945
  Number of unique weight vectors: 891

Time to load and analyse the weight vector file: 0.01 sec
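
The loading/analysis step above (grouping identical weight vectors, building the occurrence frequency distribution, and removing minority-class copies of non-pure vectors) can be sketched as follows. This is a simplification, not the original code; the tie-breaking rule for 50% pureness is an assumption:

```python
from collections import defaultdict

def analyse_weight_vectors(vectors, labels):
    # Group identical weight vectors together with their true labels.
    groups = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        groups[tuple(vec)].append(lab)
    # Frequency distribution: occurrence count -> number of unique vectors.
    freq_dist = defaultdict(int)
    kept_vectors, kept_labels = [], []
    for vec, labs in groups.items():
        freq_dist[len(labs)] += 1
        # Majority label of this unique vector; a 50/50 tie counts as
        # match here (an assumption, the original rule is not shown).
        majority = 2 * sum(labs) >= len(labs)
        for lab in labs:
            if lab == majority:  # drop minority-class copies of non-pure vectors
                kept_vectors.append(list(vec))
                kept_labels.append(lab)
    return dict(freq_dist), kept_vectors, kept_labels
```

Applied to the file above, the one unique vector with pureness 0.947 would lose its single non-match copy, leaving 945 of the 946 weight vectors.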

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (891, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 891 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 891 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 805 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 130 matches and 675 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (675, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 130 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 130 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00 accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-matches
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(10)524_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 524), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)524_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 544
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 544 weight vectors
  Containing 209 true matches and 335 true non-matches
    (38.42% true matches)
  Identified 513 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   498  (97.08%)
          2 :    12  (2.34%)
          3 :     2  (0.39%)
         16 :     1  (0.19%)

Identified 1 non-pure unique weight vector (from 513 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 334

Removed 1 non-pure weight vector

Final number of weight vectors to use: 543
  Number of unique weight vectors: 513

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (513, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 513 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 513 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 30 matches and 51 non-matches
    Purity of oracle classification:  0.630
    Entropy of oracle classification: 0.951
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 432 weight vectors
  Based on 30 matches and 51 non-matches
  Classified 151 matches and 281 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6296296296296297, 0.9509560484549725, 0.37037037037037035)
    (281, 0.6296296296296297, 0.9509560484549725, 0.37037037037037035)

Current size of match and non-match training data sets: 30 / 51

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 281 weight vectors
- Estimated match proportion 0.370

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 281 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.500, 0.452, 0.632, 0.714, 0.667] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.846, 0.684, 0.529, 0.727, 0.700] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.769, 0.714, 0.600, 0.412, 0.500] (False)
    [1.000, 0.000, 0.571, 0.867, 0.471, 0.583, 0.643] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 5 matches and 63 non-matches
    Purity of oracle classification:  0.926
    Entropy of oracle classification: 0.379
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
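
The oracle step above labels each sampled weight vector with a configurable accuracy (100% in this run, so no labels are flipped). A minimal sketch of such a noisy oracle; the function name and signature are illustrative, not taken from the original program:

```python
import random

def oracle_classify(true_labels, accuracy=1.0, rng=None):
    """Simulate the manual oracle: return each true match status
    correctly with probability `accuracy`, otherwise flipped."""
    rng = rng or random.Random(0)
    labels = []
    for lab in true_labels:
        correct = rng.random() < accuracy
        labels.append(lab if correct else not lab)
    return labels

# With 100% accuracy the oracle reproduces the truth exactly, which is
# why the run above reports 0 false matches and 0 false non-matches.
truth = [True, False, False, True]
assert oracle_classify(truth, accuracy=1.0) == truth
```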

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(10)665_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (10, 1 - acm diverg, 665), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)665_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 621
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 621 weight vectors
  Containing 164 true matches and 457 true non-matches
    (26.41% true matches)
  Identified 605 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   595  (98.35%)
          2 :     7  (1.16%)
          3 :     2  (0.33%)
          6 :     1  (0.17%)
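
The occurrence distribution above counts how often each distinct weight vector appears; it amounts to two nested counts (the helper name is illustrative, not from the original program):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map 'occurrence count' -> number of distinct weight vectors
    occurring that many times (vectors given as sequences of floats)."""
    vec_counts = Counter(map(tuple, weight_vectors))
    return Counter(vec_counts.values())

vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.9)]
# two unique vectors occur once, one occurs twice
assert occurrence_distribution(vecs) == {1: 2, 2: 1}
```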

Identified 0 non-pure unique weight vectors (from 605 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 148
     0.000 : 457

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 621
  Number of unique weight vectors: 605

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (605, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 605 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 605 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
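
The "farthest first" selection above repeatedly picks the weight vector whose minimum distance to the already-selected set is largest, spreading the sample across the cluster. A dependency-free sketch, assuming Euclidean distance (the program's actual distance measure may differ):

```python
import math

def farthest_first(vectors, k, start=0):
    """Select k indices by farthest-first traversal: each step adds the
    vector with the largest minimum Euclidean distance to the vectors
    already selected (a classic diverse-sampling heuristic)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    # minimum distance of every vector to the current selected set
    min_d = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_d[i] = min(min_d[i], dist(v, vectors[nxt]))
    return selected

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
# starting at the origin, the farthest point (1, 1) is picked next
assert farthest_first(pts, 2) == [0, 3]
```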

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 30 matches and 53 non-matches
    Purity of oracle classification:  0.639
    Entropy of oracle classification: 0.944
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0
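
The purity, entropy, and estimated match proportion printed for each cluster are consistent with the standard binary definitions (majority-class fraction, binary Shannon entropy, match fraction); the counts above (30 matches, 53 non-matches) reproduce the queue entries in the next loop:

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity, binary entropy, and estimated match proportion of a
    cluster, given oracle-classified match / non-match counts."""
    n = num_match + num_non_match
    p = num_match / n                # estimated match proportion
    purity = max(p, 1.0 - p)         # fraction of the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy, p

purity, entropy, p = cluster_stats(30, 53)
assert round(purity, 3) == 0.639     # matches the log output
assert round(entropy, 3) == 0.944
assert round(p, 3) == 0.361
```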

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 522 weight vectors
  Based on 30 matches and 53 non-matches
  Classified 112 matches and 410 non-matches
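
After the oracle step, the remaining vectors of an impure cluster are split with a classifier trained on the oracle-labelled examples. The program uses an SVM; as a dependency-free stand-in, this sketch splits with a nearest-centroid rule instead (an assumption for illustration, not the actual classifier):

```python
def split_cluster(train_matches, train_non_matches, remaining):
    """Split the remaining weight vectors into a predicted-match and a
    predicted-non-match cluster.  Stand-in for the program's SVM:
    assign each vector to the nearer class centroid."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_match = centroid(train_matches)
    c_non = centroid(train_non_matches)
    match_cluster, non_match_cluster = [], []
    for v in remaining:
        if sq_dist(v, c_match) < sq_dist(v, c_non):
            match_cluster.append(v)
        else:
            non_match_cluster.append(v)
    return match_cluster, non_match_cluster

m, nm = split_cluster([[1.0, 1.0]], [[0.0, 0.0]],
                      [[0.9, 0.8], [0.1, 0.2]])
assert m == [[0.9, 0.8]] and nm == [[0.1, 0.2]]
```

Both child clusters are then pushed back onto the queue, as the Loop 2 listing below shows.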

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)
    (410, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)

Current size of match and non-match training data sets: 30 / 53

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 410 weight vectors
- Estimated match proportion 0.361

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 410 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.615, 0.714, 0.353, 0.583, 0.571] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.762, 0.714, 0.500, 0.400] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.556, 0.429, 0.500, 0.700, 0.643] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 4 matches and 69 non-matches
    Purity of oracle classification:  0.945
    Entropy of oracle classification: 0.306
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing file: diverg(15)927_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 927), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)927_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1043
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1043 weight vectors
  Containing 222 true matches and 821 true non-matches
    (21.28% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   952  (96.26%)
          2 :    34  (3.44%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 800

Removed 1 non-pure weight vector
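
A duplicate weight vector whose copies carry mixed true-match labels is "non-pure"; the log above shows its minority-class copies being removed (here one vector occurs 17 times with 16 matches, pureness 16/17 ≈ 0.941, so the single non-match copy is dropped, leaving 1042 vectors). A sketch of that clean-up, with an illustrative helper name (the program's exact rule may differ):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    """For duplicate weight vectors with mixed true-match labels, drop
    the minority-class copies so every unique vector becomes pure."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)

    kept_vecs, kept_labels = [], []
    for vec, labs in groups.items():
        num_match = sum(labs)
        majority = num_match * 2 >= len(labs)   # ties kept as matches
        keep = max(num_match, len(labs) - num_match)
        kept_vecs.extend([list(vec)] * keep)
        kept_labels.extend([majority] * keep)
    return kept_vecs, kept_labels

# 17 copies of one vector (16 matches, 1 non-match) plus one pure vector:
vecs = [[0.9, 0.9]] * 17 + [[0.1, 0.1]]
labs = [True] * 16 + [False] + [False]
kept, _ = remove_minority_copies(vecs, labs)
assert len(kept) == 17   # the single minority copy was removed
```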

Final number of weight vectors to use: 1042
  Number of unique weight vectors: 989

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 145 matches and 757 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (757, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 145 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)638_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 638), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)638_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 612
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 612 weight vectors
  Containing 211 true matches and 401 true non-matches
    (34.48% true matches)
  Identified 560 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   524  (93.57%)
          2 :    33  (5.89%)
          3 :     2  (0.36%)
         16 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 560 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 380

Removed 1 non-pure weight vector

Final number of weight vectors to use: 611
  Number of unique weight vectors: 560

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (560, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 560 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 560 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
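The "farthest first" selection reported above can be sketched as a greedy max-min traversal: start from one vector, then repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and a random starting vector (the function name and the toy data are hypothetical, not from the script):

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first traversal: start from a random vector, then
    repeatedly add the vector whose minimum distance to the selected set
    is largest (max-min rule)."""
    rng = random.Random(seed)

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [rng.choice(vectors)]
    while len(selected) < k:
        # For each candidate, score it by its distance to the closest
        # already-selected vector, and take the candidate with the best score.
        best = max(vectors, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
picked = farthest_first(vecs, 3)
```

Because each selected vector has distance zero to itself, the max-min rule always picks a new vector, and the sample spreads toward the extremes of the cluster, which is why the selected lists above mix very high and very low similarity values.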

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 29 matches and 53 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.937
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0
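The purity and entropy figures reported for each oracle classification follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy (base 2) of the split. A minimal sketch reproducing the values above (function names are illustrative, not from the script):

```python
import math

def purity(n_match, n_nonmatch):
    """Purity: fraction of the majority class in the cluster."""
    total = n_match + n_nonmatch
    return max(n_match, n_nonmatch) / total

def entropy(n_match, n_nonmatch):
    """Binary Shannon entropy (base 2) of the match/non-match split."""
    total = n_match + n_nonmatch
    h = 0.0
    for n in (n_match, n_nonmatch):
        if n:
            p = n / total
            h -= p * math.log2(p)
    return h

# The figures reported above for 29 matches / 53 non-matches:
print(round(purity(29, 53), 3))   # 0.646
print(round(entropy(29, 53), 3))  # 0.937
```

These are the same numbers carried into the Loop 2 queue below, where both child clusters initially inherit the parent's purity, entropy, and estimated match proportion.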

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 478 weight vectors
  Based on 29 matches and 53 non-matches
  Classified 172 matches and 306 non-matches
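The split step trains a classifier on the oracle-labelled vectors (here 29 matches and 53 non-matches) and uses its predictions to divide the remaining cluster into two child clusters. The log uses an SVM; the sketch below substitutes a nearest-centroid rule so the example stays dependency-free (the function and the toy data are hypothetical):

```python
def split_cluster(labelled, remaining):
    """Split unlabelled vectors into predicted matches / non-matches.
    `labelled` is a list of (vector, is_match) pairs from the oracle.
    Nearest-centroid stand-in for the SVM used in the log."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_match = centroid([v for v, m in labelled if m])
    c_non = centroid([v for v, m in labelled if not m])
    matches, non_matches = [], []
    for v in remaining:
        # Assign each vector to the class whose centroid is closer.
        (matches if sq_dist(v, c_match) < sq_dist(v, c_non)
         else non_matches).append(v)
    return matches, non_matches

labelled = [([0.9, 0.9], True), ([0.8, 1.0], True),
            ([0.1, 0.2], False), ([0.0, 0.1], False)]
m, n = split_cluster(labelled, [[0.95, 0.85], [0.05, 0.15]])
```

Both child clusters then go back on the queue, as the Loop 2 header below shows (queue length 2, sizes 172 and 306).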

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (172, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)
    (306, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)

Current size of match and non-match training data sets: 29 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 172 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 172 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 48 matches and 10 non-matches
    Purity of oracle classification:  0.828
    Entropy of oracle classification: 0.663
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  10
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(10)40_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 40), dtype: object
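The precision, recall, and f-measure in the Series above follow from its tp/fp/fn counts via the standard formulas. A quick check against the reported values (the helper name is illustrative):

```python
def prf(tp, fp, fn):
    """Precision, recall, and F-measure from raw classification counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Counts reported above: tp=40, fp=0, fn=259
p, r, f = prf(40, 0, 259)
print(p, round(r, 6), round(f, 6))  # 1.0 0.133779 0.235988
```

With fp = 0, precision is 1.0 by construction, and the low recall (40 of 299 true matches found) yields the small f-measure shown in the row.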

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)40_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 936
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 936 weight vectors
  Containing 217 true matches and 719 true non-matches
    (23.18% true matches)
  Identified 881 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   845  (95.91%)
          2 :    33  (3.75%)
          3 :     2  (0.23%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 881 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 698

Removed 1 non-pure weight vector

Final number of weight vectors to use: 935
  Number of unique weight vectors: 881

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (881, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 881 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 881 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 25 matches and 61 non-matches
    Purity of oracle classification:  0.709
    Entropy of oracle classification: 0.870
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 795 weight vectors
  Based on 25 matches and 61 non-matches
  Classified 133 matches and 662 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)
    (662, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)

Current size of match and non-match training data sets: 25 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.71 and entropy 0.87
- Size 133 weight vectors
- Estimated match proportion 0.291

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 133 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 49 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.141
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)818_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 818), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)818_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
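The "farthest first" selections above can be sketched as a greedy farthest-first traversal: starting from some seed vector, repeatedly select the vector whose minimum distance to the already-selected set is largest, so the sample is maximally spread out. A minimal sketch (seeding from the first vector is an assumption; the original program may seed differently):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: pick k vectors that are maximally
    # spread out (each new pick maximises its minimum distance to the
    # vectors already selected)
    selected = [vectors[0]]  # seed choice is an assumption
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# e.g. from four 2-d points, the two most spread-out ones are kept
print(farthest_first([(0, 0), (1, 1), (10, 10), (5, 5)], 2))
```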

Perform oracle with 100.00 accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-matches
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)31_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 31), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)31_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 794
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 794 weight vectors
  Containing 209 true matches and 585 true non-matches
    (26.32% true matches)
  Identified 747 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   712  (95.31%)
          2 :    32  (4.28%)
          3 :     2  (0.27%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vectors (from 747 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 564

Removed 1 non-pure weight vectors
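The occurrence counts and the pureness filter above can be sketched with `collections.Counter`: count how often each unique weight vector occurs and what fraction of its occurrences are true matches, then drop the minority-class copies of any non-pure vector. The removal rule here is inferred from the log message, and the function names are mine:

```python
from collections import Counter

def analyse_weight_vectors(vectors, labels):
    # Count occurrences of each unique weight vector, and its pureness
    # (fraction of its occurrences that are true matches)
    keys = [tuple(v) for v in vectors]
    occ = Counter(keys)
    match_occ = Counter(k for k, lab in zip(keys, labels) if lab)
    freq_dist = Counter(occ.values())  # occurrence -> number of vectors
    pureness = {k: match_occ[k] / n for k, n in occ.items()}
    return freq_dist, pureness

# Toy example: one vector occurring 12 times with 11 matches gives the
# 0.917 pureness seen in the log above
vecs = [(1.0, 0.9)] * 12 + [(0.5, 0.5)] * 2 + [(0.1, 0.2)]
labs = [True] * 11 + [False] + [False, False] + [True]
freq, pure = analyse_weight_vectors(vecs, labs)
```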

Final number of weight vectors to use: 793
  Number of unique weight vectors: 747

Time to load and analyse the weight vector file: 0.04 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (747, 0.5, 1.0, 0.5)
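Each loop iteration pops a cluster from the queue (ordering here: random), draws a sample, asks the oracle to label it, adds the labels to the training sets, and pushes the split children back if the cluster is not pure enough or too large. A structural sketch with the sampling, oracle, and classifier steps passed in as stand-ins (all names are mine, and in the real program the sampled vectors are also deleted from the cluster before the split):

```python
import random

def recursive_selection(root_cluster, budget, min_purity, max_cluster_size,
                        sample, ask_oracle, split_cluster):
    # Skeleton of the queue-driven selection loop; `sample`, `ask_oracle`
    # and `split_cluster` stand in for the farthest-first sampling, the
    # manual classification, and the SVM split steps
    queue = [root_cluster]
    classifications = 0
    train_matches, train_non_matches = [], []
    while queue and classifications < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random ordering
        sampled = sample(cluster)
        matches, non_matches = ask_oracle(sampled)
        classifications += len(sampled)
        train_matches += matches
        train_non_matches += non_matches
        # Purity estimated from the oracle-labelled sample only
        purity = max(len(matches), len(non_matches)) / max(len(sampled), 1)
        if purity < min_purity or len(cluster) > max_cluster_size:
            queue += split_cluster(cluster, matches, non_matches)
    return train_matches, train_non_matches
```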

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 747 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 747 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00 accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 662 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 155 matches and 507 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (507, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 507 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 507 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.462, 0.609, 0.684, 0.308, 0.545] (False)
    [0.817, 1.000, 0.250, 0.212, 0.256, 0.045, 0.250] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)

Perform oracle with 100.00 accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 2 matches and 72 non-matches
    Purity of oracle classification:  0.973
    Entropy of oracle classification: 0.179
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)568_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 568), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)568_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 734
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 734 weight vectors
  Containing 198 true matches and 536 true non-matches
    (26.98% true matches)
  Identified 692 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   657  (94.94%)
          2 :    32  (4.62%)
          3 :     2  (0.29%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 692 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 516

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 734
  Number of unique weight vectors: 692

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (692, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 692 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 692 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00 accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 26 matches and 58 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.893
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 608 weight vectors
  Based on 26 matches and 58 non-matches
  Classified 136 matches and 472 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)
    (472, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)

Current size of match and non-match training data sets: 26 / 58

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 136 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00 accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 49 matches and 2 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.239
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(15)708_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 708), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)708_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 791
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 791 weight vectors
  Containing 188 true matches and 603 true non-matches
    (23.77% true matches)
  Identified 749 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   718  (95.86%)
          2 :    28  (3.74%)
          3 :     2  (0.27%)
         11 :     1  (0.13%)

Identified 1 non-pure unique weight vectors (from 749 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 166
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 582

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 790
  Number of unique weight vectors: 749

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (749, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 749 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 749 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
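
A farthest-first traversal greedily picks, at each step, the vector whose nearest already-selected vector is as far away as possible, which spreads the sample across the cluster (hence the dissimilar vectors listed above). A minimal sketch, assuming Euclidean distance and first-element seeding (both may differ from the actual program):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors."""
    selected = [vectors[0]]   # seeding choice is an assumption
    while len(selected) < k:
        remaining = [v for v in vectors if v not in selected]
        # Pick the vector maximising the distance to its nearest pick.
        best = max(remaining,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
    return selected
```

Each new pick maximises the minimum distance to the current selection, so the sample covers the extremes of the cluster rather than its dense centre.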

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 26 matches and 59 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.888
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
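
The purity and entropy figures follow from the match proportion p of the labelled sample: purity = max(p, 1 - p) and entropy = -p log2 p - (1 - p) log2 (1 - p). A sketch reproducing the numbers above for 26 matches and 59 non-matches:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity and binary entropy of a labelled sample of weight vectors."""
    total = num_matches + num_non_matches
    p = num_matches / float(total)      # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                     # 0 * log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy

purity, entropy = purity_entropy(26, 59)
print('%.3f %.3f' % (purity, entropy))  # prints: 0.694 0.888
```

The same p (26/85 ≈ 0.306) also appears in the queue entries as the cluster's estimated match proportion.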

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 664 weight vectors
  Based on 26 matches and 59 non-matches
  Classified 109 matches and 555 non-matches
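
Splitting the remaining cluster with an SVM trained on the oracle-labelled vectors can be sketched with scikit-learn; the kernel choice and function names here are assumptions, not the program's actual code:

```python
from sklearn import svm

def svm_split(train_vectors, train_labels, rest_vectors):
    """Train a binary SVM on oracle-labelled weight vectors, then split
    the remaining vectors into predicted-match and predicted-non-match
    sub-clusters (which are pushed back onto the queue)."""
    clf = svm.SVC(kernel='linear')      # kernel choice is an assumption
    clf.fit(train_vectors, train_labels)  # labels: 1 = match, 0 = non-match
    pred = clf.predict(rest_vectors)
    matches = [v for v, p in zip(rest_vectors, pred) if p == 1]
    non_matches = [v for v, p in zip(rest_vectors, pred) if p == 0]
    return matches, non_matches
```

The two sub-clusters inherit the parent's purity, entropy and estimated match proportion until their own samples are labelled, which is why both queue entries in Loop 2 show identical statistics.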

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)
    (555, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)

Current size of match and non-match training data sets: 26 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 555 weight vectors
- Estimated match proportion 0.306

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 555 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 16 matches and 55 non-matches
    Purity of oracle classification:  0.775
    Entropy of oracle classification: 0.770
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(20)710_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 710), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)710_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 861
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 861 weight vectors
  Containing 227 true matches and 634 true non-matches
    (26.36% true matches)
  Identified 804 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   767  (95.40%)
          2 :    34  (4.23%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 804 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 613

Removed 1 non-pure weight vector

Final number of weight vectors to use: 860
  Number of unique weight vectors: 804

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (804, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 804 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 804 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.722, 0.471, 0.545, 0.579] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.556, 0.182, 0.500, 0.071, 0.400] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 0.963, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.344, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.300, 0.524, 0.727, 0.762] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 718 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 13 matches and 705 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (13, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (705, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 13 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 12

Farthest first selection of 12 weight vectors from 13 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.958, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.971, 0.952, 1.000] (True)
    [1.000, 1.000, 1.000, 0.952, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.980, 1.000] (True)
    [0.971, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.933, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 12 weight vectors
  The oracle will correctly classify 12 weight vectors and wrongly classify 0
  Classified 12 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 12 weight vectors (classified by oracle) from cluster

Cluster is pure enough and not too large, add its 13 weight vectors to:
  Match training set

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 3: Queue length: 1
  Number of manual oracle classifications performed: 98
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (705, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 37 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 705 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 705 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.857, 0.111, 0.444, 0.529, 0.500] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.688, 0.571, 0.400, 0.529, 0.667] (False)
    [1.000, 0.000, 0.500, 0.875, 0.455, 0.333, 0.429] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 21 matches and 49 non-matches
    Purity of oracle classification:  0.700
    Entropy of oracle classification: 0.881
    Number of true matches:      21
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)305_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (15, 1 - acm diverg, 305), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)305_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 690
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 690 weight vectors
  Containing 178 true matches and 512 true non-matches
    (25.80% true matches)
  Identified 651 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   621  (95.39%)
          2 :    27  (4.15%)
          3 :     2  (0.31%)
          9 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 651 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 159
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 491

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 681
  Number of unique weight vectors: 650

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (650, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 650 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 650 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 26 matches and 57 non-matches
    Purity of oracle classification:  0.687
    Entropy of oracle classification: 0.897
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
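The purity and entropy figures reported by the oracle step follow the standard binary definitions: purity is the majority-class fraction of the sample, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (function names are illustrative, not from the original script):

```python
import math

def cluster_purity(num_matches, num_non_matches):
    """Fraction of the sample belonging to its majority class."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    """Binary Shannon entropy of the match proportion (0 = pure, 1 = 50/50)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Reproduces the oracle statistics above: 26 matches, 57 non-matches
print(round(cluster_purity(26, 57), 3))   # 0.687
print(round(cluster_entropy(26, 57), 3))  # 0.897
```

These are the same values carried into the cluster queue in the next loop (0.6867…, 0.8968…).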

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 567 weight vectors
  Based on 26 matches and 57 non-matches
  Classified 115 matches and 452 non-matches
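The SVM step trains on the oracle-labelled sample and splits the remaining weight vectors of the cluster by predicted class, which is where the two clusters of sizes 115 and 452 in the next loop come from. A hedged scikit-learn sketch on synthetic data (the kernel choice and all data below are assumptions; the original classifier settings are not shown in this output):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the oracle-labelled sample (7 weights each)
train_X = rng.random((83, 7))
train_y = (train_X.mean(axis=1) > 0.5).astype(int)  # 1 = match, 0 = non-match

# Remaining unlabelled weight vectors in the cluster
rest_X = rng.random((567, 7))

clf = SVC(kernel="linear")  # kernel choice is an assumption
clf.fit(train_X, train_y)
pred = clf.predict(rest_X)

# Split the cluster by predicted class, as in the log output above
match_cluster = rest_X[pred == 1]
non_match_cluster = rest_X[pred == 0]
print(len(match_cluster), len(non_match_cluster))
```

Both sub-clusters inherit the parent's purity/entropy estimates until the oracle samples them in a later loop.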

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (115, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)
    (452, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)

Current size of match and non-match training data sets: 26 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.90
- Size 115 weight vectors
- Estimated match proportion 0.313

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 115 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
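Farthest-first selection, as used above, greedily picks each next vector to maximise its distance to the closest already-selected vector, so the sample spreads across the cluster. A minimal sketch (the Euclidean metric and first-vector seeding are assumptions):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector, then
    repeatedly add the vector with the largest distance to its nearest
    already-selected vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seeding choice is an assumption
    # Each vector's distance to the closest selected vector so far
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        min_dist = [min(d, dist(v, vectors[i]))
                    for d, v in zip(min_dist, vectors)]
    return selected

sample = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5], [1.0, 0.0]], 3)
print(sample)  # [[0.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
```

The interior point [0.5, 0.5] is skipped because it is never the farthest from the selected set, which is exactly the diversity the sampling step wants.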

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 46 matches and 2 non-matches
    Purity of oracle classification:  0.958
    Entropy of oracle classification: 0.250
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing the file: diverg(20)9_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 9), dtype: object
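The precision, recall and f-measure in the Series above follow from the tp/fp/fn counts by the standard definitions; with tp = 39, fp = 0 and fn = 260:

```python
def prf(tp, fp, fn):
    """Standard precision, recall and F1 from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Counts from the Series above
p, r, f = prf(39, 0, 260)
print(p, round(r, 6), round(f, 6))  # 1.0 0.130435 0.230769
```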

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)9_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
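The occurrence distribution counts how often each distinct weight vector appears, then tallies those counts. With hashable tuples this is two nested `collections.Counter` passes (toy data below; the original's data structures are not shown):

```python
from collections import Counter

# Hypothetical weight vectors, as tuples so they are hashable
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.9), (1.0, 0.5)]

vec_counts = Counter(vectors)             # occurrences per unique vector
freq_dist = Counter(vec_counts.values())  # how many vectors occur n times

print(len(vec_counts))            # 3 unique vectors
print(sorted(freq_dist.items()))  # [(1, 2), (3, 1)]
```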

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
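A weight vector is "non-pure" when identical copies of it carry both labels; its pureness is the fraction of matches among those copies, and the minority-class copies are removed. A sketch of that filter (the grouping and tie handling below are assumptions):

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """Group identical weight vectors; within each group keep only the
    majority label's copies (ties keep all copies)."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        matches = sum(labels)
        non_matches = len(labels) - matches
        if matches > non_matches:
            kept += [(vec, True)] * matches
        elif non_matches > matches:
            kept += [(vec, False)] * non_matches
        else:  # tie: keep everything
            kept += [(vec, label) for label in labels]
    return kept

# Toy vector with pureness 10/11 = 0.909: the single non-match copy is dropped
data = [((1.0, 0.9), True)] * 10 + [((1.0, 0.9), False)]
print(len(remove_minority_copies(data)))  # 10
```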

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 156 matches and 800 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (800, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 156 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 156 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)379_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 379), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)379_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 714
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 714 weight vectors
  Containing 201 true matches and 513 true non-matches
    (28.15% true matches)
  Identified 669 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   635  (94.92%)
          2 :    31  (4.63%)
          3 :     2  (0.30%)
         11 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 669 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 492

Removed 1 non-pure weight vector

Final number of weight vectors to use: 713
  Number of unique weight vectors: 669

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (669, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 669 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 669 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 26 matches and 58 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.893
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 585 weight vectors
  Based on 26 matches and 58 non-matches
  Classified 123 matches and 462 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)
    (462, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)

Current size of match and non-match training data sets: 26 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 123 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-matches
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
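The purity and entropy reported for each oracle classification appear to be the majority-class fraction and the binary Shannon entropy (in bits) of the match/non-match split. A minimal sketch under exactly that assumption (the function name is illustrative, not from the script):

```python
import math

def purity_and_entropy(num_match, num_non_match):
    # Purity: fraction of the majority class in the classified sample.
    # Entropy: binary Shannon entropy of the match/non-match split, in bits.
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# The 48-match / 1-non-match oracle result above gives
# purity ~ 0.980 and entropy ~ 0.144, matching the log.
print(purity_and_entropy(48, 1))
```

The 24-match / 63-non-match sample later in the log likewise yields purity 0.724 and entropy 0.850 under this definition.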

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(20)431_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 431), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)431_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)
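The occurrence distribution above is a nested count: first count how often each distinct weight vector occurs, then tally how many distinct vectors share each occurrence count (here 880 + 33·2 + 2·3 + 19 = 971 vectors, 916 of them unique). A dependency-free sketch:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # Count each distinct vector's occurrences (tuples are hashable),
    # then tally how many distinct vectors share each occurrence count.
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

# Toy data: one vector occurring 3 times, another occurring once.
vectors = [[1.0, 0.5], [1.0, 0.5], [1.0, 0.5], [0.2, 0.3]]
print(sorted(occurrence_distribution(vectors).items()))  # [(1, 1), (3, 1)]
```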

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731
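Pureness per unique weight vector can be computed by grouping labelled occurrences by vector and taking the match fraction; the 0.947 entry above is consistent with the one vector that occurs 19 times, 18 of them as true matches. A sketch under that assumption (names are illustrative):

```python
from collections import defaultdict

def pureness_per_unique_vector(labelled_vectors):
    # labelled_vectors: iterable of (weight_vector, is_true_match) pairs.
    # Returns, for each distinct vector, the fraction of its occurrences
    # that are true matches; values strictly between 0 and 1 mark the
    # non-pure vectors whose minority-class copies get removed.
    counts = defaultdict(lambda: [0, 0])  # vector -> [num_matches, total]
    for vec, is_match in labelled_vectors:
        counts[tuple(vec)][0] += int(is_match)
        counts[tuple(vec)][1] += 1
    return {v: m / t for v, (m, t) in counts.items()}

# One vector seen 19 times, 18 of them true matches: pureness 18/19 ~ 0.947.
data = [([0.9, 0.8], True)] * 18 + [([0.9, 0.8], False)]
print(round(pureness_per_unique_vector(data)[(0.9, 0.8)], 3))  # 0.947
```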

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
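The farthest-first selection above greedily picks weight vectors that are maximally spread out in the similarity space. A common formulation is sketched below; the script's actual seed choice and distance metric are assumptions here:

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: start from the first vector and
    # repeatedly add the vector whose minimum Euclidean distance to the
    # already-selected set is largest.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

pts = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [0.9, 1.0]]
print(farthest_first(pts, 2))  # [[0.0, 0.0], [1.0, 1.0]]
```

Each round thus covers a different corner of the weight-vector space, which is why the selected sample mixes clear matches, clear non-matches, and borderline vectors.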

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches
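The SVM step trains a binary classifier on the oracle-labelled vectors and splits the remaining cluster into predicted matches and non-matches, which then re-enter the queue. As a dependency-free stand-in (a nearest-centroid rule rather than an actual SVM, purely to illustrate the split):

```python
def nearest_centroid_classify(train_match, train_non_match, unlabelled):
    # Stand-in for the SVM splitting step: label each remaining weight
    # vector by which training-class centroid it is closer to.
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    cm = centroid(train_match)
    cn = centroid(train_non_match)
    return [sqdist(v, cm) <= sqdist(v, cn) for v in unlabelled]

labels = nearest_centroid_classify(
    [[0.9, 0.8], [0.95, 0.9]],    # oracle-labelled matches
    [[0.1, 0.2], [0.2, 0.1]],     # oracle-labelled non-matches
    [[0.85, 0.9], [0.15, 0.05]])  # remaining unlabelled vectors
print(labels)  # [True, False]
```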

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 123 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(10)398_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (10, 1 - acm diverg, 398), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)398_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 460
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 460 weight vectors
  Containing 200 true matches and 260 true non-matches
    (43.48% true matches)
  Identified 428 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   412  (96.26%)
          2 :    13  (3.04%)
          3 :     2  (0.47%)
         16 :     1  (0.23%)

Identified 1 non-pure unique weight vector (from 428 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 170
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 257

Removed 1 non-pure weight vector

Final number of weight vectors to use: 459
  Number of unique weight vectors: 428

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (428, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 428 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 428 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 37 matches and 41 non-matches
    Purity of oracle classification:  0.526
    Entropy of oracle classification: 0.998
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  41
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 350 weight vectors
  Based on 37 matches and 41 non-matches
  Classified 136 matches and 214 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.5256410256410257, 0.9981021327390103, 0.47435897435897434)
    (214, 0.5256410256410257, 0.9981021327390103, 0.47435897435897434)

Current size of match and non-match training data sets: 37 / 41

Selected cluster (queue ordering: random):
- Purity 0.53 and entropy 1.00
- Size 214 weight vectors
- Estimated match proportion 0.474

Sample size for this cluster: 66

Farthest first selection of 66 weight vectors from 214 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 3 matches and 63 non-matches
    Purity of oracle classification:  0.955
    Entropy of oracle classification: 0.267
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing the file: diverg(10)891_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 891), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)891_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 785
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 785 weight vectors
  Containing 212 true matches and 573 true non-matches
    (27.01% true matches)
  Identified 731 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   696  (95.21%)
          2 :    32  (4.38%)
          3 :     2  (0.27%)
         19 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 731 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 178
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 552

Removed 1 non-pure weight vector

Final number of weight vectors to use: 784
  Number of unique weight vectors: 731

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (731, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 731 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 731 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 31 matches and 54 non-matches
    Purity of oracle classification:  0.635
    Entropy of oracle classification: 0.947
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
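
The purity and entropy figures reported for each oracle-classified sample follow directly from the match/non-match counts. A minimal sketch of that calculation (the function name `purity_entropy` is illustrative, not from the program):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary entropy of a
    sample of oracle-classified weight vectors."""
    total = num_matches + num_non_matches
    p = num_matches / total          # proportion of matches
    purity = max(p, 1.0 - p)         # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                  # 0 * log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy

# Counts reported above: 31 matches, 54 non-matches
purity, entropy = purity_entropy(31, 54)
print(round(purity, 3), round(entropy, 3))  # 0.635 0.947
```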

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 646 weight vectors
  Based on 31 matches and 54 non-matches
  Classified 319 matches and 327 non-matches
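
The SVM step above uses the oracle-labelled sample as training data and splits the remaining weight vectors of the cluster into predicted matches and non-matches. A hedged sketch using scikit-learn's `SVC` (the program's actual SVM implementation and kernel choice may differ):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Split a cluster's remaining weight vectors into predicted matches
    and non-matches, using an SVM trained on the oracle-classified
    sample (labels: 1 = match, 0 = non-match)."""
    clf = SVC(kernel="linear")  # linear kernel is an assumption here
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    return cluster_vecs[pred == 1], cluster_vecs[pred == 0]

# Toy illustration with clearly separable one-dimensional "weight vectors"
train = np.array([[0.0], [0.1], [0.9], [1.0]])
labels = np.array([0, 0, 1, 1])
matches, non_matches = svm_split(train, labels, np.array([[0.05], [0.95]]))
```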

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (319, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)
    (327, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)

Current size of match and non-match training data sets: 31 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.95
- Size 319 weight vectors
- Estimated match proportion 0.365

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 319 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
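
Farthest-first selection, as used above, is the greedy k-centre heuristic: start from one vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. A minimal sketch (the program's distance metric and choice of starting vector may differ):

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first (k-centre) selection of k indices."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [start]
    # distance of each vector to its nearest selected vector so far
    dist = np.linalg.norm(vectors - vectors[start], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(dist))  # vector farthest from all selected
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected

# Example: five 2-D points; the three mutually farthest are picked
print(farthest_first([[0, 0], [0, 1], [10, 0], [10, 1], [5, 5]], 3))  # [0, 3, 4]
```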

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 42 matches and 28 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(15)703_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (15, 1 - acm diverg, 703), dtype: object
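
The precision, recall and f-measure values in the Series above are consistent with its tp/fp/fn counts; as a quick check:

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Counts reported above: tp = 47, fp = 1, fn = 252
p, r, f = prf(47, 1, 252)
print(round(p, 6), round(r, 6), round(f, 6))  # 0.979167 0.157191 0.270893
```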

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)703_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 927
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 927 weight vectors
  Containing 206 true matches and 721 true non-matches
    (22.22% true matches)
  Identified 874 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   839  (96.00%)
          2 :    32  (3.66%)
          3 :     2  (0.23%)
         18 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 874 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 700

Removed 1 non-pure weight vector
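
The pureness step above groups identical weight vectors and, for any group containing both matches and non-matches, removes the minority-class copies. A hedged sketch of that idea (`remove_minority` is illustrative, not the program's function name):

```python
from collections import defaultdict

def remove_minority(weight_vectors, labels):
    """Drop minority-class copies of any duplicated weight vector whose
    group mixes matches (label 1) and non-matches (label 0)."""
    groups = defaultdict(list)
    for vec, label in zip(weight_vectors, labels):
        groups[tuple(vec)].append(label)
    kept = []
    for vec, grp in groups.items():
        pureness = sum(grp) / len(grp)       # fraction of matches in group
        majority = 1 if pureness >= 0.5 else 0
        for label in grp:
            if pureness in (0.0, 1.0) or label == majority:
                kept.append((vec, label))
    return kept

# An 18-copy group with 17 matches and 1 non-match (pureness 0.944,
# as in the log above) loses its single non-match copy
vecs = [(0.9, 1.0)] * 18 + [(0.1, 0.0)]
labs = [1] * 17 + [0, 0]
```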

Final number of weight vectors to use: 926
  Number of unique weight vectors: 874

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (874, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 874 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 874 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 30 matches and 56 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.933
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 788 weight vectors
  Based on 30 matches and 56 non-matches
  Classified 194 matches and 594 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (194, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)
    (594, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)

Current size of match and non-match training data sets: 30 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 594 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 594 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.429, 0.632, 0.250, 0.750] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.333, 0.667, 0.400, 0.583, 0.563] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.444, 0.643, 0.421, 0.200, 0.556] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.350, 0.455, 0.625, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 0.857, 0.286, 0.500, 0.643, 0.600] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 0 matches and 76 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing the file: diverg(20)528_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 528), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)528_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 754
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 754 weight vectors
  Containing 222 true matches and 532 true non-matches
    (29.44% true matches)
  Identified 718 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   699  (97.35%)
          2 :    16  (2.23%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 718 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 529

Removed 1 non-pure weight vector

Final number of weight vectors to use: 753
  Number of unique weight vectors: 718

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (718, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 718 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 718 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 634 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 146 matches and 488 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (488, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 488 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 488 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

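Farthest-first selection, as used for the sample lists above, can be sketched as a greedy traversal: start from one vector, then repeatedly pick the vector whose minimum Euclidean distance to the already-selected set is largest. An illustrative re-implementation (the actual script may differ in details such as the choice of starting vector):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedily select k vectors by farthest-first traversal
    (Euclidean distance). Illustrative sketch only."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[start]]
    # Minimum distance from each vector to the selected set so far
    min_d = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_d[i] = min(min_d[i], dist(v, vectors[idx]))
    return selected
```

For example, `farthest_first([[0, 0], [10, 0], [5, 0], [0, 1]], 2)` picks `[0, 0]` and then `[10, 0]`, the point farthest from it. The greedy choice spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches and clear non-matches.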
Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 8 matches and 67 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.490
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)702_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 702), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)702_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 652
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 652 weight vectors
  Containing 154 true matches and 498 true non-matches
    (23.62% true matches)
  Identified 616 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   588  (95.45%)
          2 :    25  (4.06%)
          3 :     2  (0.32%)
          8 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 616 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 138
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 477

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 644
  Number of unique weight vectors: 615

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (615, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 615 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 615 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 24 matches and 59 non-matches
    Purity of oracle classification:  0.711
    Entropy of oracle classification: 0.868
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 532 weight vectors
  Based on 24 matches and 59 non-matches
  Classified 98 matches and 434 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (98, 0.7108433734939759, 0.8676293117125106, 0.2891566265060241)
    (434, 0.7108433734939759, 0.8676293117125106, 0.2891566265060241)

Current size of match and non-match training data sets: 24 / 59

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 434 weight vectors
- Estimated match proportion 0.289

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 434 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.423, 0.478, 0.500, 0.813, 0.545] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.636, 0.429, 0.632, 0.250, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 12 matches and 55 non-matches
    Purity of oracle classification:  0.821
    Entropy of oracle classification: 0.678
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing file: diverg(15)968_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (15, 1 - acm diverg, 968), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)968_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 945
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 945 weight vectors
  Containing 211 true matches and 734 true non-matches
    (22.33% true matches)
  Identified 892 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   857  (96.08%)
          2 :    32  (3.59%)
          3 :     2  (0.22%)
         18 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 892 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 713

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 944
  Number of unique weight vectors: 892

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (892, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 892 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 892 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 806 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 148 matches and 658 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (658, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 148 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0
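The purity and entropy figures the program reports for a classified sample are consistent with the standard two-class definitions: purity is the majority-class fraction, entropy the base-2 Shannon entropy of the match/non-match proportions. A minimal sketch (the helper name is illustrative, not from the program):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity (majority-class fraction) and base-2 Shannon entropy
    of a sample with the given match / non-match counts."""
    total = num_matches + num_non_matches
    p_m = num_matches / total
    p_n = num_non_matches / total
    purity = max(p_m, p_n)
    entropy = -sum(p * math.log2(p) for p in (p_m, p_n) if p > 0)
    return purity, entropy

# The 54 oracle-classified vectors above: 49 matches, 5 non-matches.
purity, entropy = purity_entropy(49, 5)
print(round(purity, 3), round(entropy, 3))  # 0.907 0.445
```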

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing file: diverg(15)385_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 385), dtype: object
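The f-measure column in the rows above is the harmonic mean of precision and recall (F1), which can be verified against the printed values, e.g. precision 1 and recall 0.140468 give 0.246334. A quick check (helper name is illustrative):

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Row above: precision 1.0, recall 0.140468 -> f-measure 0.246334
print(round(f_measure(1.0, 0.140468), 6))  # 0.246334
```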

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)385_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 491
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 491 weight vectors
  Containing 222 true matches and 269 true non-matches
    (45.21% true matches)
  Identified 455 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   436  (95.82%)
          2 :    16  (3.52%)
          3 :     2  (0.44%)
         17 :     1  (0.22%)
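The frequency distribution above tallies how often each exact weight vector occurs; duplicates come from record pairs with identical similarity values. A minimal sketch of that tally, assuming each weight vector is a sequence of floats:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map 'number of occurrences' -> 'how many unique vectors occur that often'."""
    per_vector = Counter(map(tuple, weight_vectors))  # vector -> occurrence count
    return Counter(per_vector.values())               # count  -> number of vectors

vecs = [(1.0, 0.5), (1.0, 0.5), (0.3, 0.9), (0.2, 0.1), (0.2, 0.1), (0.2, 0.1)]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```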

Identified 1 non-pure unique weight vector (from 455 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 266

Removed 1 non-pure weight vector
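Pureness of a unique weight vector is the fraction of its occurrences generated by true matches; for mixed vectors the log here removes the minority-class copies (note that in a later run the log instead removes all copies of a mixed vector, so the actual rule likely depends on a threshold not shown here). A minimal sketch of the minority-removal case only (names are illustrative):

```python
from collections import defaultdict

def remove_minority_class(weight_vectors, labels):
    """Drop the minority-class copies of any weight vector whose
    occurrences mix true matches and true non-matches."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, labels):
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, flags in groups.items():
        pureness = sum(flags) / len(flags)
        majority = pureness >= 0.5  # assumption: keep the majority class
        for is_match in flags:
            if pureness in (0.0, 1.0) or is_match == majority:
                kept.append((vec, is_match))
    return kept

vecs  = [(0.9, 0.8)] * 3 + [(0.1, 0.2)]
flags = [True, True, False, False]  # (0.9, 0.8) has pureness 2/3
print(len(remove_minority_class(vecs, flags)))  # 3  (one minority copy removed)
```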

Final number of weight vectors to use: 490
  Number of unique weight vectors: 455

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (455, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 455 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 455 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
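The "far" initial selection above is a farthest-first traversal: pick a start vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest, spreading the sample across the weight space. A minimal sketch (Euclidean distance, deterministic start at index 0; the program may choose its start differently):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first selection of k vectors (Euclidean distance).
    Returns the indices of the selected vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [start]
    # min distance from every vector to the selected set so far
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[nxt]))
    return selected

pts = [(0.0, 0.0), (1.0, 0.0), (0.1, 0.1), (1.0, 1.0)]
print(farthest_first(pts, 3))  # [0, 3, 1]
```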

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 36 matches and 43 non-matches
    Purity of oracle classification:  0.544
    Entropy of oracle classification: 0.994
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 376 weight vectors
  Based on 36 matches and 43 non-matches
  Classified 151 matches and 225 non-matches
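The split step trains a classifier on the oracle-labelled vectors and uses it to partition the remaining vectors into a candidate-match and a candidate-non-match cluster. A minimal sketch using scikit-learn's `SVC` (the log does not state the kernel or parameters, so defaults are assumed):

```python
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on oracle-labelled vectors, then split the
    remaining vectors into predicted matches / non-matches."""
    clf = SVC()  # assumption: default RBF kernel; the program's settings are unknown
    clf.fit(labelled_vecs, labels)
    preds = clf.predict(unlabelled_vecs)
    matches     = [v for v, p in zip(unlabelled_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, preds) if p == 0]
    return matches, non_matches

# Toy example: high similarities labelled match (1), low labelled non-match (0)
train = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]]
y     = [1, 1, 0, 0]
m, n  = svm_split(train, y, [[0.95, 0.85], [0.15, 0.15]])
print(len(m), len(n))  # 1 1
```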

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.5443037974683544, 0.9943290455933882, 0.45569620253164556)
    (225, 0.5443037974683544, 0.9943290455933882, 0.45569620253164556)

Current size of match and non-match training data sets: 36 / 43

Selected cluster with (queue ordering: random):
- Purity 0.54 and entropy 0.99
- Size 225 weight vectors
- Estimated match proportion 0.456

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 225 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 5 matches and 62 non-matches
    Purity of oracle classification:  0.925
    Entropy of oracle classification: 0.383
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)120_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 120), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)120_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 539
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 539 weight vectors
  Containing 224 true matches and 315 true non-matches
    (41.56% true matches)
  Identified 500 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   481  (96.20%)
          2 :    16  (3.20%)
          3 :     2  (0.40%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 500 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 312

Removed 1 non-pure weight vector

Final number of weight vectors to use: 538
  Number of unique weight vectors: 500

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (500, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 500 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 500 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 33 matches and 47 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.978
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 420 weight vectors
  Based on 33 matches and 47 non-matches
  Classified 150 matches and 270 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.5875, 0.9777945702913884, 0.4125)
    (270, 0.5875, 0.9777945702913884, 0.4125)

Current size of match and non-match training data sets: 33 / 47

Selected cluster with (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 150 weight vectors
- Estimated match proportion 0.412

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.909, 1.000, 1.000, 1.000, 0.947] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 52 matches and 6 non-matches
    Purity of oracle classification:  0.897
    Entropy of oracle classification: 0.480
    Number of true matches:      52
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)54_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (15, 1 - acm diverg, 54), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)54_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 919
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 919 weight vectors
  Containing 176 true matches and 743 true non-matches
    (19.15% true matches)
  Identified 880 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   850  (96.59%)
          2 :    27  (3.07%)
          3 :     2  (0.23%)
          9 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 880 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 157
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 722

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 910
  Number of unique weight vectors: 879

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (879, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 879 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 879 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 25 matches and 61 non-matches
    Purity of oracle classification:  0.709
    Entropy of oracle classification: 0.870
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
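
The purity and entropy figures reported by the oracle step can be reproduced with a short sketch (the helper name `cluster_stats` is hypothetical; the program's own functions are not shown in this log):

```python
import math

def cluster_stats(labels):
    """Purity and binary entropy of a cluster of match/non-match labels.
    labels: 1 for a true match, 0 for a true non-match."""
    n = len(labels)
    p = sum(labels) / n            # match proportion
    purity = max(p, 1.0 - p)       # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# 25 matches and 61 non-matches, as in the oracle round above
purity, entropy = cluster_stats([1] * 25 + [0] * 61)
print(round(purity, 3), round(entropy, 3))  # 0.709 0.87
```

Purity is the majority-class fraction (61/86 ≈ 0.709), and entropy is the Shannon entropy of the match proportion, matching the cluster tuples printed in the queue below.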

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 793 weight vectors
  Based on 25 matches and 61 non-matches
  Classified 105 matches and 688 non-matches
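
The SVM step above uses the oracle-labelled sample as training data and splits the remaining weight vectors into a predicted-match and a predicted-non-match cluster. A minimal stand-in sketch, substituting a nearest-centroid rule for the SVM so it runs on the standard library alone (all names and data here are hypothetical):

```python
def centroid(vecs):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vecs)
    return [sum(col) / n for col in zip(*vecs)]

def split_cluster(unlabelled, matches, non_matches):
    """Split the unlabelled vectors into two predicted clusters by
    distance to the class centroids (a stand-in for the SVM split)."""
    cm, cn = centroid(matches), centroid(non_matches)
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    pred_m = [v for v in unlabelled if d2(v, cm) < d2(v, cn)]
    pred_n = [v for v in unlabelled if d2(v, cm) >= d2(v, cn)]
    return pred_m, pred_n

matches = [[0.9, 1.0], [1.0, 0.9]]       # hypothetical labelled sample
non_matches = [[0.1, 0.2], [0.2, 0.1]]
pred_m, pred_n = split_cluster([[0.95, 0.95], [0.15, 0.1]],
                               matches, non_matches)
print(len(pred_m), len(pred_n))  # 1 1
```

Both resulting clusters then go back onto the queue, as the "Queue length: 2" line below shows.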

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (105, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)
    (688, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)

Current size of match and non-match training data sets: 25 / 61

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 105 weight vectors
- Estimated match proportion 0.291

Sample size for this cluster: 45

Farthest-first selection of 45 weight vectors from 105 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
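
The farthest-first selection shown above can be sketched as a greedy max-min traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest. This is an illustrative version; the program's exact seeding and distance metric are not shown in this log.

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    add the vector maximising the minimum squared Euclidean distance to
    the selected set, until k vectors are chosen."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist2(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy example: from four 2-d points, pick the three farthest apart
pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.0, 1.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```

Selecting spread-out vectors in this way favours a diverse sample for the oracle to label.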

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analyzing file: diverg(10)352_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976744
recall                 0.421405
f-measure              0.588785
da                          129
dm                            0
ndm                           0
tp                          126
fp                            3
tn                  4.76529e+07
fn                          173
Name: (10, 1 - acm diverg, 352), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)352_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 303
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 303 weight vectors
  Containing 133 true matches and 170 true non-matches
    (43.89% true matches)
  Identified 290 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   282  (97.24%)
          2 :     5  (1.72%)
          3 :     2  (0.69%)
          5 :     1  (0.34%)
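
The occurrence distribution above can be computed with two nested counts: first map each unique weight vector to its frequency, then count how many vectors share each frequency. A sketch with made-up data:

```python
from collections import Counter

# Hypothetical weight vectors; tuples so they are hashable
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (1.0, 0.5), (0.9, 0.1)]

occurrences = Counter(map(tuple, vectors))    # vector -> how often it occurs
distribution = Counter(occurrences.values())  # occurrence -> number of vectors
for occ, num in sorted(distribution.items()):
    print(f"{occ:>2} : {num}")
```

Vectors occurring more than once here correspond to duplicate record pairs producing identical similarity weights.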

Identified 0 non-pure unique weight vectors (from 290 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 120
     0.000 : 170

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 303
  Number of unique weight vectors: 290

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (290, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 290 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 72

Perform initial selection using "far" method

Farthest first selection of 72 weight vectors from 290 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 40 matches and 32 non-matches
    Purity of oracle classification:  0.556
    Entropy of oracle classification: 0.991
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  32
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 218 weight vectors
  Based on 40 matches and 32 non-matches
  Classified 213 matches and 5 non-matches

  Non-match cluster not large enough for required sample size
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 1
  Number of manual oracle classifications performed: 72
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (213, 0.5555555555555556, 0.9910760598382222, 0.5555555555555556)

Current size of match and non-match training data sets: 40 / 32

Selected cluster with (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 213 weight vectors
- Estimated match proportion 0.556

Sample size for this cluster: 66

Farthest first selection of 66 weight vectors from 213 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.817, 1.000, 0.250, 0.212, 0.256, 0.045, 0.250] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 1.000, 1.000, 0.952, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.518, 1.000, 0.179, 0.245, 0.111, 0.182, 0.103] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.800, 1.000, 0.111, 0.200, 0.100, 0.194, 0.094] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 34 matches and 32 non-matches
    Purity of oracle classification:  0.515
    Entropy of oracle classification: 0.999
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  32
    Number of false non-matches: 0

Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

129.0
Analyzing file: diverg(20)967_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 967), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)967_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
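
The purity and entropy reported for each classified sample follow directly from the two-class label counts; a minimal sketch of that calculation (the function name is illustrative, not taken from the program):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is measured in bits
    over the match/non-match label distribution of the sample."""
    total = num_matches + num_non_matches
    p_match = num_matches / total
    purity = max(p_match, 1.0 - p_match)
    entropy = -sum(p * math.log2(p) for p in (p_match, 1.0 - p_match) if p > 0.0)
    return purity, entropy

purity, entropy = purity_entropy(14, 54)  # the 68-vector sample above
print(round(purity, 3), round(entropy, 3))
```

For the 68-vector sample (14 matches, 54 non-matches) this gives purity 0.794 and entropy 0.734, matching the figures in the log.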

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)830_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                0.9875
recall                 0.264214
f-measure              0.416887
da                           80
dm                            0
ndm                           0
tp                           79
fp                            1
tn                  4.76529e+07
fn                          220
Name: (10, 1 - acm diverg, 830), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)830_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 790
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 790 weight vectors
  Containing 185 true matches and 605 true non-matches
    (23.42% true matches)
  Identified 748 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   717  (95.86%)
          2 :    28  (3.74%)
          3 :     2  (0.27%)
         11 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 748 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 163
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 584

Removed 1 non-pure weight vector
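
The removal step can be sketched as follows: for each unique weight vector that occurs with both true-match labels, the copies carrying the minority label are dropped (a sketch; the function name and the tie handling are assumptions):

```python
from collections import Counter

def remove_minority_copies(pairs):
    """pairs: list of (weight_vector, is_match) tuples.  For any weight
    vector seen with both labels, drop copies carrying the minority label."""
    label_counts = {}
    for vec, label in pairs:
        label_counts.setdefault(vec, Counter())[label] += 1
    kept = []
    for vec, label in pairs:
        counts = label_counts[vec]
        if counts[label] >= counts[not label]:  # keep pure / majority copies
            kept.append((vec, label))
    return kept

# A vector occurring 11 times with 10 match / 1 non-match labels
# (pureness 0.909) loses its single non-match copy.
pairs = [(("v1",), True)] * 10 + [(("v1",), False)] + [(("v2",), False)]
cleaned = remove_minority_copies(pairs)
```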

Final number of weight vectors to use: 789
  Number of unique weight vectors: 748

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (748, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 748 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 748 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
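
The farthest-first traversal behind these selections can be sketched as a greedy loop that repeatedly picks the vector with the largest minimum distance to the already-selected set (a sketch assuming Euclidean distance; the program's actual metric and starting point may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors: start from the first vector, then keep
    adding the vector farthest from everything selected so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    # Minimum distance from each candidate to the selected set so far
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        far_idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[far_idx])
        min_dist = [min(d, dist(v, vectors[far_idx]))
                    for v, d in zip(vectors, min_dist)]
    return selected

sample = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5], [1.0, 0.0]], 3)
```

Because each pick maximises the minimum distance to the current sample, the selection spreads across the weight-vector space rather than clustering around the start point.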

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 31 matches and 54 non-matches
    Purity of oracle classification:  0.635
    Entropy of oracle classification: 0.947
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 663 weight vectors
  Based on 31 matches and 54 non-matches
  Classified 305 matches and 358 non-matches
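
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by predicted label; a minimal sketch with scikit-learn's `SVC` on made-up vectors (the data and parameters here are illustrative, not the vectors above):

```python
import numpy as np
from sklearn.svm import SVC

# Oracle-labelled sample: similarity weight vectors, 1 = match, 0 = non-match
train_X = np.array([[0.90, 0.80, 0.95], [0.85, 0.90, 0.80],
                    [0.10, 0.20, 0.05], [0.20, 0.10, 0.15]])
train_y = np.array([1, 1, 0, 0])

clf = SVC(kernel="linear")
clf.fit(train_X, train_y)

# Partition the remaining, unlabelled vectors into two child clusters
remaining = np.array([[0.80, 0.85, 0.90], [0.15, 0.10, 0.20]])
pred = clf.predict(remaining)
match_cluster = remaining[pred == 1]
non_match_cluster = remaining[pred == 0]
```

Both child clusters then go back onto the queue carrying the purity, entropy, and estimated match proportion of the oracle-labelled sample, which is why the two queue entries in the loop output share identical statistics.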

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (305, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)
    (358, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)

Current size of match and non-match training data sets: 31 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.95
- Size 358 weight vectors
- Estimated match proportion 0.365

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 358 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.700, 0.429, 0.476, 0.647, 0.810] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.370, 0.818, 0.800, 0.550, 0.500] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.067, 0.550, 0.818, 0.727, 0.762] (False)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.857, 0.875, 0.625, 0.333, 0.667] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0
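
The "oracle with 100.00% accuracy" is a simulated human classifier; at lower accuracy settings it flips each true label with probability 1 − accuracy. A sketch of such a simulation (the function name and seeding are illustrative assumptions):

```python
import random

def simulated_oracle(true_labels, accuracy, seed=42):
    """Return the true labels, each kept with probability `accuracy`
    and flipped otherwise; accuracy 1.0 reproduces the labels exactly."""
    rng = random.Random(seed)
    return [lbl if rng.random() < accuracy else not lbl for lbl in true_labels]

perfect = simulated_oracle([True, False, True], 1.0)  # no flips at 100% accuracy
noisy = simulated_oracle([True] * 100, 0.9)           # roughly 10 flips expected
```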

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

80.0
Analysing file: diverg(20)995_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 995), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)995_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1073
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1073 weight vectors
  Containing 226 true matches and 847 true non-matches
    (21.06% true matches)
  Identified 1016 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   979  (96.36%)
          2 :    34  (3.35%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1016 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 826

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1072
  Number of unique weight vectors: 1016

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1016, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1016 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1016 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 929 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 332 matches and 597 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (332, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (597, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 597 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 597 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.692, 0.583, 0.500, 0.750, 0.731] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)2_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 2), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)2_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 639
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 639 weight vectors
  Containing 187 true matches and 452 true non-matches
    (29.26% true matches)
  Identified 599 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   565  (94.32%)
          2 :    31  (5.18%)
          3 :     2  (0.33%)
          6 :     1  (0.17%)

Identified 0 non-pure unique weight vectors (from 599 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.000 : 432

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 639
  Number of unique weight vectors: 599

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (599, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 599 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 599 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
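
The farthest-first selection used above can be sketched as a greedy traversal: repeatedly pick the vector whose distance to its nearest already-selected vector is largest. This is a minimal illustration, not the original implementation; the seeding choice (the first vector) is an assumption.

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising the distance
    to its nearest already-selected vector."""
    selected = [vectors[0]]          # seeding choice is an assumption
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```

The effect is visible in the listing above: the selected sample mixes clearly matching, clearly non-matching, and borderline vectors, rather than clustering in one region of the weight space.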

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
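
The purity and entropy figures reported after each oracle round follow the standard two-class definitions: purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion. A sketch (function name illustrative):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a labelled sample."""
    total = num_matches + num_non_matches
    purity = max(num_matches, num_non_matches) / total
    p = num_matches / total
    if p in (0.0, 1.0):
        entropy = 0.0                # a pure sample has zero entropy
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy
```

For the 27 matches and 56 non-matches above this gives purity 0.675 and entropy 0.910, matching the logged values.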

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 516 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 147 matches and 369 non-matches
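
The SVM split step can be sketched as follows: the oracle-labelled sample seeds a classifier that partitions the remaining weight vectors into two child clusters. This assumes scikit-learn; the linear kernel and function name are assumptions, not details from the original program.

```python
from sklearn.svm import SVC  # assumption: scikit-learn provides the SVM

def svm_split(match_vectors, non_match_vectors, unlabelled):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining weight vectors into predicted matches / non-matches."""
    X = match_vectors + non_match_vectors
    y = [1] * len(match_vectors) + [0] * len(non_match_vectors)
    clf = SVC(kernel="linear")       # kernel choice is an assumption
    clf.fit(X, y)
    preds = clf.predict(unlabelled)
    matches = [v for v, p in zip(unlabelled, preds) if p == 1]
    non_matches = [v for v, p in zip(unlabelled, preds) if p == 0]
    return matches, non_matches
```

The two resulting clusters are what appears in the queue on the next loop, each inheriting the purity, entropy, and estimated match proportion of the sample that produced the split.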

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (369, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 369 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 369 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.767, 0.667, 0.545, 0.786, 0.773] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 1 match and 68 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.109
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(15)451_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 451), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)451_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 793
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 793 weight vectors
  Containing 223 true matches and 570 true non-matches
    (28.12% true matches)
  Identified 739 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   702  (94.99%)
          2 :    34  (4.60%)
          3 :     2  (0.27%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 739 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 549

Removed 1 non-pure weight vector

Final number of weight vectors to use: 792
  Number of unique weight vectors: 739

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (739, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 739 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 739 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 654 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 157 matches and 497 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (157, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (497, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 157 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 157 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 49 matches and 7 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)779_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (15, 1 - acm diverg, 779), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)779_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 706
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 706 weight vectors
  Containing 211 true matches and 495 true non-matches
    (29.89% true matches)
  Identified 653 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   618  (94.64%)
          2 :    32  (4.90%)
          3 :     2  (0.31%)
         18 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 653 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 474

Removed 1 non-pure weight vector

Final number of weight vectors to use: 705
  Number of unique weight vectors: 653

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (653, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 653 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 653 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
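The purity and entropy figures reported above follow directly from the match / non-match counts: purity is the fraction of vectors in the majority class, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (the function name `cluster_stats` is ours, not from the original script):

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity and entropy of a cluster from oracle-labelled counts."""
    total = num_match + num_non_match
    p = num_match / total            # estimated match proportion
    purity = max(p, 1.0 - p)         # fraction in the majority class
    if p in (0.0, 1.0):
        entropy = 0.0                # a pure cluster has zero entropy
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy, p

# Reproduce the figures above: 28 matches, 55 non-matches.
purity, entropy, p = cluster_stats(28, 55)
print(round(purity, 3), round(entropy, 3), round(p, 3))  # 0.663 0.922 0.337
```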

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 570 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 155 matches and 415 non-matches
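The split step above trains an SVM on the oracle-labelled vectors and partitions the remaining unlabelled vectors of the cluster by predicted class. A sketch of that idea using scikit-learn's `SVC` on synthetic stand-in data (the vectors and the linear kernel are our assumptions, not the original script's configuration):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Hypothetical stand-ins for the oracle-labelled training vectors:
# 28 "matches" with high similarities, 55 "non-matches" with low ones.
X_train = np.vstack([rng.uniform(0.6, 1.0, (28, 7)),
                     rng.uniform(0.0, 0.4, (55, 7))])
y_train = np.array([1] * 28 + [0] * 55)

# The remaining unlabelled weight vectors of the cluster.
X_rest = rng.uniform(0.0, 1.0, (570, 7))

clf = SVC(kernel='linear').fit(X_train, y_train)
pred = clf.predict(X_rest)

# Split the cluster into two new clusters on the predicted class.
match_cluster = X_rest[pred == 1]
non_match_cluster = X_rest[pred == 0]
print(len(match_cluster), len(non_match_cluster))
```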

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (415, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 155 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 155 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
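Farthest-first selection, as used above, is the greedy k-center heuristic: start from one vector and repeatedly add the vector whose distance to the already-selected set is largest, so the sample spreads across the cluster. A self-contained sketch under Euclidean distance (the starting-point choice is our assumption):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first (k-center) selection of k row indices."""
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    selected = [int(rng.integers(len(vectors)))]     # random start point
    # min_dist[i] = distance from vector i to its nearest selected vector
    min_dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))               # farthest remaining vector
        selected.append(nxt)
        d = np.linalg.norm(vectors - vectors[nxt], axis=1)
        min_dist = np.minimum(min_dist, d)           # update nearest distances
    return selected

# Toy example: select 5 of 100 random 7-dimensional weight vectors.
rng = np.random.default_rng(1)
idx = farthest_first(rng.uniform(0, 1, (100, 7)), 5)
print(idx)
```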

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 47 matches and 8 non-matches
    Purity of oracle classification:  0.855
    Entropy of oracle classification: 0.598
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing the file: diverg(10)919_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 919), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)919_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 650
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 650 weight vectors
  Containing 198 true matches and 452 true non-matches
    (30.46% true matches)
  Identified 605 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   571  (94.38%)
          2 :    31  (5.12%)
          3 :     2  (0.33%)
         11 :     1  (0.17%)
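The frequency distribution above can be computed with two passes of `collections.Counter`: one mapping each unique weight vector to its occurrence count, and one mapping each occurrence count to the number of vectors that occur that often. A sketch with hypothetical vectors (stored as tuples so they are hashable):

```python
from collections import Counter

# Hypothetical weight vectors, as tuples so they are hashable.
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.9), (0.2, 0.9), (0.2, 0.9),
           (0.7, 0.1)]

occurrences = Counter(vectors)                 # vector -> occurrence count
freq_dist = Counter(occurrences.values())      # occurrence count -> #vectors
print(dict(freq_dist))  # {2: 1, 3: 1, 1: 1}
```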

Identified 1 non-pure unique weight vector (from 605 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 431

Removed 1 non-pure weight vector

Final number of weight vectors to use: 649
  Number of unique weight vectors: 605

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (605, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 605 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 605 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 30 matches and 53 non-matches
    Purity of oracle classification:  0.639
    Entropy of oracle classification: 0.944
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 522 weight vectors
  Based on 30 matches and 53 non-matches
  Classified 193 matches and 329 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (193, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)
    (329, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)

Current size of match and non-match training data sets: 30 / 53

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 193 weight vectors
- Estimated match proportion 0.361

Sample size for this cluster: 61

Farthest first selection of 61 weight vectors from 193 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.619, 1.000, 0.103, 0.163, 0.129, 0.146, 0.213] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 61 weight vectors
  The oracle will correctly classify 61 weight vectors and wrongly classify 0
  Classified 41 matches and 20 non-matches
    Purity of oracle classification:  0.672
    Entropy of oracle classification: 0.913
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  20
    Number of false non-matches: 0

Deleted 61 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(15)189_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 189), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)189_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 714
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 714 weight vectors
  Containing 218 true matches and 496 true non-matches
    (30.53% true matches)
  Identified 659 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   623  (94.54%)
          2 :    33  (5.01%)
          3 :     2  (0.30%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 659 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 475

Removed 1 non-pure weight vector

Final number of weight vectors to use: 713
  Number of unique weight vectors: 659

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (659, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 659 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 659 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 575 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 165 matches and 410 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (165, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (410, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 410 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 410 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.700, 0.429, 0.476, 0.647, 0.810] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.600, 0.857, 0.579, 0.286, 0.545] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 2 matches and 70 non-matches
    Purity of oracle classification:  0.972
    Entropy of oracle classification: 0.183
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0
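
The purity and entropy figures reported above follow the usual two-class definitions: purity is the fraction of the majority class in the classified sample, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (function names are illustrative, not taken from the original script):

```python
import math

def purity(num_matches, num_non_matches):
    # Fraction of the majority class in the classified sample.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Binary Shannon entropy (in bits) of the match proportion.
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Reproduces the figures above for 2 matches / 70 non-matches:
print(round(purity(2, 70), 3))   # 0.972
print(round(entropy(2, 70), 3))  # 0.183
```

The same formulas also reproduce the 0.739 / 0.829 figures that appear later for the 23-match / 65-non-match sample.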

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)393_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (20, 1 - acm diverg, 393), dtype: object
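
The precision, recall, and f-measure rows above are consistent with the tp/fp/fn counts via the standard definitions; a quick check (variable names follow the printout):

```python
tp, fp, fn = 52, 1, 247  # counts from the series above

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_measure = 2 * precision * recall / (precision + recall)

print(round(precision, 6))  # 0.981132
print(round(recall, 6))     # 0.173913
print(round(f_measure, 6))  # 0.295455
```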

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)393_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1087
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1087 weight vectors
  Containing 214 true matches and 873 true non-matches
    (19.69% true matches)
  Identified 1033 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   998  (96.61%)
          2 :    32  (3.10%)
          3 :     2  (0.19%)
         19 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1033 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector
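
The non-pure handling above can be sketched as follows: identical weight vectors are grouped, the match fraction of each group is computed, and for groups that mix classes the minority-class copies are dropped. This is a sketch under an assumed data layout, not the original code:

```python
from collections import defaultdict

def remove_minority_class(weight_vectors):
    """weight_vectors: list of (tuple_of_weights, is_match) pairs.
    Drops the minority-class copies of every non-pure unique vector."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        matches = sum(labels)
        non_matches = len(labels) - matches
        if 0 < matches < len(labels):  # non-pure group: keep majority only
            majority = matches >= non_matches
            kept += [(vec, majority)] * max(matches, non_matches)
        else:                          # pure group: keep everything
            kept += [(vec, labels[0])] * len(labels)
    return kept

# An 18-vs-1 mixed group (pureness 18/19 = 0.947, as in the table above)
# loses its single minority copy; the pure group survives intact.
data = [((0.5,), True)] * 18 + [((0.5,), False)] + [((0.9,), True)] * 3
print(len(remove_minority_class(data)))  # 21
```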

Final number of weight vectors to use: 1086
  Number of unique weight vectors: 1033

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1033, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1033 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1033 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
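
The "farthest first" listing above is the classic greedy max-min traversal: start from one vector, then repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal sketch assuming Euclidean distance and first-index tie-breaking (the original script's metric and tie-breaking rule are not shown here):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy max-min selection of k vectors (farthest-first traversal)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    # Minimum distance from every vector to the selected set so far.
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```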

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 945 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 98 matches and 847 non-matches
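
The SVM split above trains a classifier on the vectors the oracle just labelled, then uses its predictions to partition the remaining cluster into a predicted-match and a predicted-non-match sub-cluster. A sketch with scikit-learn (the original script's SVM parameters are not shown, so a linear kernel with defaults is assumed):

```python
from sklearn.svm import SVC

def split_cluster(labelled_vecs, labelled_classes, unlabelled_vecs):
    """Train an SVM on oracle-labelled vectors, then split the rest
    of the cluster by predicted class (1 = match, 0 = non-match)."""
    clf = SVC(kernel="linear")
    clf.fit(labelled_vecs, labelled_classes)
    preds = clf.predict(unlabelled_vecs)
    match_cluster = [v for v, p in zip(unlabelled_vecs, preds) if p == 1]
    non_match_cluster = [v for v, p in zip(unlabelled_vecs, preds) if p == 0]
    return match_cluster, non_match_cluster

# Toy example: 1-D weights, matches near 1.0, non-matches near 0.0.
train_x = [[0.1], [0.2], [0.9], [0.95]]
train_y = [0, 0, 1, 1]
m, n = split_cluster(train_x, train_y, [[0.05], [0.85]])
print(len(m), len(n))  # 1 1
```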

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (98, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)757_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 757), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)757_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1099
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1099 weight vectors
  Containing 227 true matches and 872 true non-matches
    (20.66% true matches)
  Identified 1042 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1005  (96.45%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1042 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 851

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1098
  Number of unique weight vectors: 1042

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1042, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1042 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1042 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 954 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 845 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (845, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)155_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (15, 1 - acm diverg, 155), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)155_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 948
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 948 weight vectors
  Containing 166 true matches and 782 true non-matches
    (17.51% true matches)
  Identified 911 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.60%)
          2 :    28  (3.07%)
          3 :     2  (0.22%)
          6 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 911 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 149
     0.000 : 762

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 948
  Number of unique weight vectors: 911

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (911, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 911 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 911 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
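
The farthest-first selections shown throughout this log can be sketched as a greedy traversal: repeatedly add the candidate whose minimum distance to the already-selected vectors is largest. This is a sketch only — the log does not confirm the distance measure or the starting point, so Euclidean distance and a first-vector start are assumptions:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector, then
    repeatedly add the candidate farthest from the already-selected set
    (farthest = largest minimum distance to any selected vector)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    candidates = list(vectors[1:])
    while len(selected) < k and candidates:
        best = max(candidates,
                   key=lambda v: min(dist(v, s) for s in selected))
        candidates.remove(best)
        selected.append(best)
    return selected

# e.g. farthest_first(weight_vectors, 82) would mirror the
# "selection of 82 weight vectors from 570 vectors" steps below
```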

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
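
The purity and entropy figures reported after each oracle classification follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. A small sketch (the function name is illustrative) reproducing the 0.678 / 0.906 values above from 28 matches and 59 non-matches:

```python
from math import log2

def purity_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class. Entropy: binary Shannon
    entropy of the match proportion (0 * log 0 treated as 0)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    if p in (0.0, 1.0):
        entropy = 0.0
    else:
        entropy = -(p * log2(p) + (1.0 - p) * log2(1.0 - p))
    return purity, entropy

pur, ent = purity_entropy(28, 59)
print(round(pur, 3), round(ent, 3))  # 0.678 0.906
```

The same function gives purity 1.000 and entropy 0.000 for the later all-non-match oracle sample (0 matches, 72 non-matches).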

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 824 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 117 matches and 707 non-matches
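
At this step the program trains an SVM on the 87 oracle-labelled vectors and splits the 824 remaining vectors by its predictions into the 117/707 sub-clusters. As a dependency-free illustration of the split-by-classifier idea — explicitly *not* the program's SVM, but a nearest-centroid stand-in:

```python
def split_by_classifier(train_matches, train_non_matches, cluster):
    """Stand-in for the SVM split step: assign each remaining vector to
    the class of the nearer training centroid, giving two sub-clusters."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cm = centroid(train_matches)
    cn = centroid(train_non_matches)
    matches = [v for v in cluster if sqdist(v, cm) < sqdist(v, cn)]
    non_matches = [v for v in cluster if sqdist(v, cm) >= sqdist(v, cn)]
    return matches, non_matches
```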

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (117, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (707, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 117 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 117 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 42 matches and 7 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0
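
The usage notes list an `oracle_acc` parameter; a plausible reading — an assumption, not confirmed by the log — is that the simulated oracle reports each true label with that probability and flips it otherwise, which is why at 100% accuracy every block shows zero false matches and false non-matches:

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Simulated oracle: report each true label with probability
    `accuracy`, otherwise flip it. random() is in [0, 1), so with
    accuracy 1.0 every label is reported correctly."""
    rng = rng or random.Random(0)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```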

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analyzing the file: diverg(10)706_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 706), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)706_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 622
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 622 weight vectors
  Containing 194 true matches and 428 true non-matches
    (31.19% true matches)
  Identified 570 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   534  (93.68%)
          2 :    33  (5.79%)
          3 :     2  (0.35%)
         16 :     1  (0.18%)
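
A frequency distribution like the one above can be built with two nested `Counter` passes — once from vector to occurrence count, once from occurrence count to the number of vectors with that count. The data below is illustrative, not the 570 vectors from this log:

```python
from collections import Counter

# Illustrative weight vectors: two vectors occur twice, one occurs once
vectors = [(1.0, 0.5)] * 2 + [(0.0, 0.3)] * 2 + [(0.2, 0.9)]

occurrence = Counter(vectors)             # vector -> how often it occurs
freq_dist = Counter(occurrence.values())  # occurrence count -> number of vectors
print(freq_dist)  # Counter({2: 2, 1: 1})
```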

Identified 1 non-pure unique weight vector (from 570 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 162
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 407

Removed 1 non-pure weight vector

Final number of weight vectors to use: 621
  Number of unique weight vectors: 570

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (570, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 570 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 570 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 32 matches and 50 non-matches
    Purity of oracle classification:  0.610
    Entropy of oracle classification: 0.965
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 488 weight vectors
  Based on 32 matches and 50 non-matches
  Classified 150 matches and 338 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)
    (338, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)

Current size of match and non-match training data sets: 32 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.96
- Size 338 weight vectors
- Estimated match proportion 0.390

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 338 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.367, 0.667, 0.583, 0.625, 0.316] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.438, 0.500, 0.467, 0.529, 0.611] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.818, 0.727, 0.438, 0.375, 0.400] (False)
    [0.857, 0.000, 0.500, 0.389, 0.235, 0.045, 0.526] (False)
    [1.000, 0.000, 0.476, 0.179, 0.500, 0.412, 0.357] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.833, 0.571, 0.727, 0.647, 0.857] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.583, 0.875, 0.727, 0.833, 0.643] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 0 matches and 72 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analyzing the file: diverg(10)555_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 555), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)555_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 592
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 592 weight vectors
  Containing 212 true matches and 380 true non-matches
    (35.81% true matches)
  Identified 558 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   543  (97.31%)
          2 :    12  (2.15%)
          3 :     2  (0.36%)
         19 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 558 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 379
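
The pureness listing reads as the majority-class fraction among repeated occurrences of the same weight vector: a vector at pureness 0.947 with 19 occurrences would be 18 copies of one class and 1 of the other, and the minority copy is what gets removed. A sketch (the helper name is illustrative):

```python
from collections import defaultdict

def vector_pureness(labelled_vectors):
    """For each unique weight vector, compute pureness as the fraction of
    the majority class among its occurrences. Input: (vector, is_match)
    pairs; output: dict mapping vector -> pureness in (0.5, 1.0]."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [non-matches, matches]
    for vec, is_match in labelled_vectors:
        counts[vec][int(is_match)] += 1
    return {vec: max(n, m) / (n + m) for vec, (n, m) in counts.items()}
```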

Removed 1 non-pure weight vector

Final number of weight vectors to use: 591
  Number of unique weight vectors: 558

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (558, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 558 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 558 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 26 matches and 56 non-matches
    Purity of oracle classification:  0.683
    Entropy of oracle classification: 0.901
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 476 weight vectors
  Based on 26 matches and 56 non-matches
  Classified 130 matches and 346 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.6829268292682927, 0.9011701959974223, 0.3170731707317073)
    (346, 0.6829268292682927, 0.9011701959974223, 0.3170731707317073)

Current size of match and non-match training data sets: 26 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 130 weight vectors
- Estimated match proportion 0.317

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 130 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
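The "Farthest first selection" above greedily picks, at each step, the weight vector whose minimum Euclidean distance to the already-selected vectors is largest. A sketch of the idea (the seeding rule of the actual program is not shown in the log; this version seeds with the first vector):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly select the vector whose
    minimum distance to all previously selected vectors is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                        # seed (an assumption)
    min_dist = [dist(vectors[0], v) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):            # refresh nearest-selected distances
            min_dist[j] = min(min_dist[j], dist(vectors[i], v))
    return selected
```

This spreads the sample across the cluster, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.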

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 50 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.139
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
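The purity and entropy figures reported for each oracle classification (0.980 and 0.139 above) are consistent with purity being the majority-class fraction of the labelled sample and entropy being the binary Shannon entropy, in bits, of the match proportion:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = binary Shannon
    entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1 - p)
    entropy = 0.0
    for q in (p, 1 - p):
        if q > 0:                       # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For example, 50 matches and 1 non-match give purity 50/51 ≈ 0.980 and entropy ≈ 0.139, matching the log.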

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
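Putting the log together, each run follows one recursive pattern: pop a cluster from the queue, draw a sample, have the oracle label it, grow the match/non-match training sets, and, if the remainder is still impure or too large, split it with a classifier and push the parts back on the queue, until the manual classification budget is spent. A skeleton of that loop, with `sample`, `oracle`, `split`, and `is_done` as hypothetical stand-ins for the program's actual routines:

```python
def recursive_selection(vectors, budget, sample, oracle, split, is_done):
    """Skeleton of the loop seen in the log; the callables are stand-ins."""
    queue = [vectors]
    train_m, train_n = [], []
    used = 0                                   # manual classifications spent
    while queue and used < budget:
        cluster = queue.pop(0)
        chosen = sample(cluster)               # e.g. farthest-first selection
        used += len(chosen)
        labels = [oracle(v) for v in chosen]   # manual oracle classification
        train_m += [v for v, l in zip(chosen, labels) if l]
        train_n += [v for v, l in zip(chosen, labels) if not l]
        rest = [v for v in cluster if v not in chosen]
        if rest and not is_done(rest, train_m, train_n):
            queue.extend(split(rest, train_m, train_n))  # e.g. SVM split
    return train_m, train_n
```

The "Reached end of manual classification budget" message corresponds to the `used < budget` condition failing while clusters are still queued.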

40.0
Analyzing file: diverg(20)245_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 245), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)245_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1019 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec
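The analysis block above (unique vectors, the occurrence frequency distribution, and the pureness of each unique vector) can be sketched as follows; `pairs` is a hypothetical list of `(weight_vector, is_true_match)` tuples, with vectors as hashable tuples:

```python
from collections import Counter

def analyse_weight_vectors(pairs):
    """Return the occurrence frequency distribution of unique weight
    vectors and, per unique vector, its pureness (fraction of its
    occurrences that are true matches)."""
    occ = Counter(v for v, _ in pairs)                # vector -> occurrences
    match_count = Counter(v for v, m in pairs if m)   # vector -> match count
    freq_dist = Counter(occ.values())                 # occurrence -> #vectors
    pureness = {v: match_count[v] / n for v, n in occ.items()}
    return freq_dist, pureness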

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)401_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 401), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)401_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 608
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 608 weight vectors
  Containing 187 true matches and 421 true non-matches
    (30.76% true matches)
  Identified 568 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   534  (94.01%)
          2 :    31  (5.46%)
          3 :     2  (0.35%)
          6 :     1  (0.18%)

Identified 0 non-pure unique weight vectors (from 568 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.000 : 401

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 608
  Number of unique weight vectors: 568

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (568, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 568 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 568 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 486 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 148 matches and 338 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (338, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 148 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 47 matches and 7 non-matches
    Purity of oracle classification:  0.870
    Entropy of oracle classification: 0.556
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analyzing file: diverg(20)275_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 275), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)275_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1019 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
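
Farthest-first selection as reported above greedily picks, at each step, the vector whose nearest already-selected vector is farthest away. A minimal sketch, assuming Euclidean distance and a deterministic first pick (the actual script may instead start from a random vector):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each one maximising its minimum
    Euclidean distance to the vectors already selected."""
    selected = [vectors[0]]  # simplification: start from the first vector
    while len(selected) < k and len(selected) < len(vectors):
        best_vec, best_dist = None, -1.0
        for v in vectors:
            if v in selected:
                continue
            # Distance to the closest already-selected vector
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best_vec, best_dist = v, d
        selected.append(best_vec)
    return selected
```

This tends to pick vectors spread across the whole cluster, which is why the list above mixes clear matches, clear non-matches, and borderline cases.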

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0
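
The purity and entropy values reported for the oracle's classification are consistent with the standard two-class definitions: purity is the majority-class fraction, entropy the base-2 Shannon entropy of the match proportion. A sketch (a reconstruction, not the script's own code):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = base-2
    Shannon entropy of the match / non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # by convention, 0 * log2(0) = 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the 23 matches and 64 non-matches above this gives purity 64/87 ≈ 0.736 and entropy ≈ 0.833, matching the log.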

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches
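
To split the remaining cluster, a classifier is trained on the oracle-labelled vectors and applied to the unlabelled rest. A minimal sketch using scikit-learn's SVC with a linear kernel (an assumption; the original script's SVM implementation and settings may differ):

```python
from sklearn.svm import SVC

def split_cluster(train_vecs, train_labels, unlabelled_vecs):
    """Train a linear SVM on oracle-labelled weight vectors and split
    the remaining vectors into predicted matches and non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)  # labels: 1 = match, 0 = non-match
    preds = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, preds) if p == 0]
    return matches, non_matches
```

The two predicted groups then re-enter the queue as separate clusters, which is why the Loop 2 output below shows a queue of length 2.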

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)593_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 593), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)593_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 731
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 731 weight vectors
  Containing 210 true matches and 521 true non-matches
    (28.73% true matches)
  Identified 698 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   684  (97.99%)
          2 :    11  (1.58%)
          3 :     2  (0.29%)
         19 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 698 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 177
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 520

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 730
  Number of unique weight vectors: 698

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (698, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 698 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 698 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 614 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 122 matches and 492 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (122, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (492, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 492 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 492 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.462, 0.609, 0.643, 0.706, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 12 matches and 62 non-matches
    Purity of oracle classification:  0.838
    Entropy of oracle classification: 0.639
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)583_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 583), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)583_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 275
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 275 weight vectors
  Containing 199 true matches and 76 true non-matches
    (72.36% true matches)
  Identified 242 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   228  (94.21%)
          2 :    11  (4.55%)
          3 :     2  (0.83%)
         19 :     1  (0.41%)

Identified 1 non-pure unique weight vector (from 242 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 166
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 75

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 274
  Number of unique weight vectors: 242

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (242, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 242 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 69

Perform initial selection using "far" method

Farthest first selection of 69 weight vectors from 242 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 36 matches and 33 non-matches
    Purity of oracle classification:  0.522
    Entropy of oracle classification: 0.999
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  33
    Number of false non-matches: 0
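
The purity and entropy figures reported after each oracle step follow the standard two-class definitions: purity is the majority-class fraction, and entropy is the Shannon entropy (in bits) of the match/non-match distribution. A minimal sketch reproducing the numbers above:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity (majority-class fraction) and Shannon entropy (bits)
    of a cluster containing matches and non-matches."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The 36-match / 33-non-match oracle result above:
p, e = purity_entropy(36, 33)
print(round(p, 3), round(e, 3))  # 0.522 0.999
```

The unrounded entropy, 0.9986359641585718, is exactly the value carried into the Loop 2 queue listing.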

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 173 weight vectors
  Based on 36 matches and 33 non-matches
  Classified 138 matches and 35 non-matches
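
The SVM step trains on the oracle-labelled sample and splits the remaining cluster into predicted matches and non-matches, which then re-enter the queue. A minimal sketch with scikit-learn; the kernel and parameters are assumptions, not necessarily those of the original script:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled weight vectors and split the
    remaining cluster into predicted matches (1) and non-matches (0)."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, y in zip(cluster_vecs, pred) if y == 1]
    non_matches = [v for v, y in zip(cluster_vecs, pred) if y == 0]
    return matches, non_matches
```

Each predicted side becomes a new cluster on the queue, as the Loop 2 listing that follows shows.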

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 69
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.5217391304347826, 0.9986359641585718, 0.5217391304347826)
    (35, 0.5217391304347826, 0.9986359641585718, 0.5217391304347826)

Current size of match and non-match training data sets: 36 / 33

Selected cluster (queue ordering: random) with:
- Purity 0.52 and entropy 1.00
- Size 138 weight vectors
- Estimated match proportion 0.522

Sample size for this cluster: 57
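
The per-cluster sample sizes in this log (57 of 138, 85 of 769, 74 of 538, …) are consistent, to within rounding, with Cochran's sample-size formula with finite-population correction at z = 1.96 (95% confidence) and a 0.1 margin of error. This is an inference from the printed numbers, not confirmed from the source:

```python
def cochran_sample_size(cluster_size, match_prop, error=0.1, z=1.96):
    """Cochran's sample size with finite-population correction.
    z = 1.96 (95% confidence) and error = 0.1 are inferred from the
    printed sample sizes, not confirmed from the source code."""
    n0 = z * z * match_prop * (1.0 - match_prop) / (error * error)
    return n0 / (1.0 + (n0 - 1.0) / cluster_size)

# Close to the "Sample size for this cluster" values in this log,
# e.g. cochran_sample_size(138, 36 / 69) is about 56.8 (log: 57).
```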

Farthest first selection of 57 weight vectors from 138 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
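
Farthest-first selection greedily picks each next vector to maximise its distance from the already-selected set, giving a diverse sample like the listing above. A minimal sketch with Euclidean distance; the seeding rule and metric used by the original script are assumptions:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors under Euclidean
    distance; seeding from the first vector is an assumption."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Next pick: the vector whose nearest selected vector is farthest.
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

print(farthest_first([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [0.9, 1.0]], 2))
# [[0.0, 0.0], [1.0, 1.0]]
```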

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 49 matches and 8 non-matches
    Purity of oracle classification:  0.860
    Entropy of oracle classification: 0.585
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)279_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 279), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)279_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
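
The frequency distribution above (how many unique weight vectors occur once, twice, and so on) can be computed by counting duplicates; a minimal sketch:

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map each occurrence count to the number of unique weight
    vectors occurring that often (e.g. {1: 750, 2: 16, ...})."""
    vec_counts = Counter(tuple(v) for v in vectors)
    return Counter(vec_counts.values())

print(sorted(occurrence_distribution([(0.1,), (0.1,), (0.2,), (0.3,)]).items()))
# [(1, 2), (2, 1)]  -- two vectors occur once, one occurs twice
```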

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769
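
A non-pure unique weight vector is one whose copies carry both match and non-match labels; its minority-class copies are removed (the step that reduces 808 vectors to 807 above, where a vector with pureness 0.950 loses its single non-match copy). A minimal sketch; the tie-breaking rule at pureness 0.5 is an assumption:

```python
from collections import defaultdict

def remove_minority_copies(vec_label_pairs):
    """For each unique weight vector keep only its majority-class copies.
    Pureness is the fraction of a vector's copies labelled True (match);
    the >= 0.5 tie-breaking rule is an assumption."""
    groups = defaultdict(list)
    for vec, is_match in vec_label_pairs:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        majority = sum(labels) / len(labels) >= 0.5
        kept.extend((list(vec), lab) for lab in labels if lab == majority)
    return kept
```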

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 146 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (538, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 538 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 538 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 9 matches and 65 non-matches
    Purity of oracle classification:  0.878
    Entropy of oracle classification: 0.534
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)332_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 332), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)332_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 716
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 716 weight vectors
  Containing 195 true matches and 521 true non-matches
    (27.23% true matches)
  Identified 692 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   675  (97.54%)
          2 :    14  (2.02%)
          3 :     2  (0.29%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 692 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 519

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 716
  Number of unique weight vectors: 692

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (692, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 692 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 692 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 608 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 134 matches and 474 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (134, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (474, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 474 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 474 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.870, 0.619, 0.643, 0.700, 0.524] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0
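The purity and entropy figures reported after each oracle step can be reproduced from the match / non-match counts alone: purity is the majority-class fraction, entropy the base-2 Shannon entropy of the class distribution. A minimal sketch (the function name `purity_entropy` is ours, not from the script):

```python
from math import log2

def purity_entropy(num_match, num_non_match):
    """Purity (majority-class fraction) and Shannon entropy (base 2)
    of a binary match / non-match classification."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                  # 0 * log2(0) is taken as 0
            entropy -= q * log2(q)
    return purity, entropy

# The oracle result above: 7 matches, 66 non-matches
purity, entropy = purity_entropy(7, 66)
print(round(purity, 3), round(entropy, 3))  # 0.904 0.456
```

The same function reproduces the later loops, e.g. 33 matches / 48 non-matches gives purity 0.593 and entropy 0.975, matching the cluster-queue entries.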

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(15)420_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 420), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)420_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 548
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 548 weight vectors
  Containing 226 true matches and 322 true non-matches
    (41.24% true matches)
  Identified 509 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   490  (96.27%)
          2 :    16  (3.14%)
          3 :     2  (0.39%)
         20 :     1  (0.20%)
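An occurrence distribution like the one above (how many unique weight vectors occur once, twice, etc.) can be built with two passes of `collections.Counter`; this is a sketch of the idea, not the script's own code:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map 'occurs n times' -> 'number of unique vectors with that count'."""
    counts = Counter(map(tuple, weight_vectors))  # vector -> occurrences
    return Counter(counts.values())               # occurrences -> how many vectors

# Toy example: one vector occurs twice, one once, one three times
vecs = [[0.5, 1.0], [0.5, 1.0], [0.2, 0.3], [0.9, 0.9], [0.9, 0.9], [0.9, 0.9]]
dist = occurrence_distribution(vecs)
print(sorted(dist.items()))  # [(1, 1), (2, 1), (3, 1)]
```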

Identified 1 non-pure unique weight vector (from 509 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 319

Removed 1 non-pure weight vector

Final number of weight vectors to use: 547
  Number of unique weight vectors: 509

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (509, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 509 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 509 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
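The farthest-first selections in this log pick weight vectors that are maximally spread out in similarity space. A minimal sketch of the classic greedy heuristic, assuming Euclidean distance and seeding from the first vector (the script's actual seeding and distance measure are not shown here):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly add the vector whose distance to its nearest already
    selected vector is largest."""
    k = min(k, len(vectors))
    selected = [vectors[0]]
    while len(selected) < k:
        remaining = [v for v in vectors if v not in selected]
        # Pick the candidate farthest from the current selection
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
    return selected

pts = [(0.0, 0.0), (1.0, 0.0), (0.1, 0.0), (0.5, 0.5)]
print(farthest_first(pts, 2))  # [(0.0, 0.0), (1.0, 0.0)]
```

Each round is O(n * k) distance evaluations, which is why the sample sizes per cluster stay small relative to the cluster size.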

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 33 matches and 48 non-matches
    Purity of oracle classification:  0.593
    Entropy of oracle classification: 0.975
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 428 weight vectors
  Based on 33 matches and 48 non-matches
  Classified 152 matches and 276 non-matches
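The split step trains a classifier on the oracle-labelled sample and uses its predictions to divide the remaining cluster into two child clusters. A sketch assuming scikit-learn's `SVC` with a linear kernel (the script's actual SVM implementation and parameters are not shown in this log):

```python
from sklearn.svm import SVC

def split_cluster(labelled_vecs, labelled_classes, unlabelled_vecs):
    """Train an SVM on oracle-classified vectors, then split the remaining
    vectors into predicted matches (class 1) and non-matches (class 0)."""
    clf = SVC(kernel="linear")
    clf.fit(labelled_vecs, labelled_classes)
    pred = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 0]
    return matches, non_matches

# Toy separable example: high-similarity pairs labelled 1, low labelled 0
m, n = split_cluster([[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]],
                     [1, 1, 0, 0],
                     [[0.85, 0.95], [0.15, 0.15]])
```

Both children inherit the parent's purity/entropy estimates until they are sampled themselves, which is why the two queued clusters above show identical statistics.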

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.5925925925925926, 0.975119064940866, 0.4074074074074074)
    (276, 0.5925925925925926, 0.975119064940866, 0.4074074074074074)

Current size of match and non-match training data sets: 33 / 48

Selected cluster with (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 152 weight vectors
- Estimated match proportion 0.407

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 152 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 53 matches and 5 non-matches
    Purity of oracle classification:  0.914
    Entropy of oracle classification: 0.424
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)495_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 495), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)495_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1021
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1021 weight vectors
  Containing 221 true matches and 800 true non-matches
    (21.65% true matches)
  Identified 967 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   930  (96.17%)
          2 :    34  (3.52%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 967 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 779

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1020
  Number of unique weight vectors: 967

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (967, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 967 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 967 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 32 matches and 55 non-matches
    Purity of oracle classification:  0.632
    Entropy of oracle classification: 0.949
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 880 weight vectors
  Based on 32 matches and 55 non-matches
  Classified 301 matches and 579 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (301, 0.632183908045977, 0.9489804585630242, 0.367816091954023)
    (579, 0.632183908045977, 0.9489804585630242, 0.367816091954023)

Current size of match and non-match training data sets: 32 / 55

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 301 weight vectors
- Estimated match proportion 0.368

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 301 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 46 matches and 23 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  23
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)748_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (15, 1 - acm diverg, 748), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)748_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 755
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 755 weight vectors
  Containing 170 true matches and 585 true non-matches
    (22.52% true matches)
  Identified 718 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   687  (95.68%)
          2 :    28  (3.90%)
          3 :     2  (0.28%)
          6 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 718 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 153
     0.000 : 565

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 755
  Number of unique weight vectors: 718

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (718, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 718 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 718 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
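
The "far" initial selection method above is a farthest-first traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance over the weight vectors (the function name and the choice of seed vector are illustrative, not taken from the script):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                  # seed with the first vector
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        # Index of the vector farthest from the current selection
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected
```

This spreads the sample across the weight vector space, which is why both very match-like and very non-match-like vectors appear in the listing above.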

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 25 matches and 59 non-matches
    Purity of oracle classification:  0.702
    Entropy of oracle classification: 0.878
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
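
The purity and entropy figures reported for an oracle classification follow the standard two-class definitions: purity is the majority-class fraction, entropy is the Shannon entropy in bits. A sketch with illustrative names:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Two-class purity (majority fraction) and Shannon entropy (bits)."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                      # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For 25 matches and 59 non-matches this yields purity ≈ 0.702 and entropy ≈ 0.878, matching the values above.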

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 634 weight vectors
  Based on 25 matches and 59 non-matches
  Classified 91 matches and 543 non-matches
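
The SVM split step (train on the oracle-labelled sample, then split the remaining cluster by predicted class) can be sketched with scikit-learn; the linear kernel and default parameters below are assumptions, since the script's actual SVM settings are not shown in this output:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled weight vectors and split the
    remaining cluster into predicted-match / predicted-non-match parts."""
    clf = SVC(kernel='linear')           # kernel choice is an assumption
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels, dtype=int))
    preds = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows by one per split.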

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (91, 0.7023809523809523, 0.8783609387702276, 0.2976190476190476)
    (543, 0.7023809523809523, 0.8783609387702276, 0.2976190476190476)

Current size of match and non-match training data sets: 25 / 59

Selected cluster (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 543 weight vectors
- Estimated match proportion 0.298

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 543 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 16 matches and 54 non-matches
    Purity of oracle classification:  0.771
    Entropy of oracle classification: 0.776
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
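
An oracle with a configurable accuracy can be sketched as label flipping with probability 1 − accuracy; with 100% accuracy, as in these runs, no label is flipped. This is an illustrative reconstruction (names invented), not the script's code:

```python
import random

def noisy_oracle(true_labels, accuracy, rng=random):
    """Return oracle labels: each true label is kept with probability
    `accuracy` (in [0, 1]) and flipped otherwise."""
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

With accuracy below 1.0, the "wrongly classify" count in the log would become non-zero and false (non-)matches would enter the training sets.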

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing file: diverg(15)935_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 935), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)935_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 683
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 683 weight vectors
  Containing 201 true matches and 482 true non-matches
    (29.43% true matches)
  Identified 638 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   604  (94.67%)
          2 :    31  (4.86%)
          3 :     2  (0.31%)
         11 :     1  (0.16%)
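
The occurrence frequency distribution above can be reproduced with two nested counts: first count how often each unique weight vector occurs, then count how many unique vectors share each occurrence count. A minimal sketch (the function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight
    vectors that occur exactly that often."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```

For instance, two vectors each occurring 3 times plus one occurring once gives {3: 2, 1: 1}.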

Identified 1 non-pure unique weight vector (from 638 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 461

Removed 1 non-pure weight vector

Final number of weight vectors to use: 682
  Number of unique weight vectors: 638

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (638, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 638 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 638 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 26 matches and 57 non-matches
    Purity of oracle classification:  0.687
    Entropy of oracle classification: 0.897
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 555 weight vectors
  Based on 26 matches and 57 non-matches
  Classified 129 matches and 426 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (129, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)
    (426, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)

Current size of match and non-match training data sets: 26 / 57

Selected cluster (queue ordering: random):
- Purity 0.69 and entropy 0.90
- Size 129 weight vectors
- Estimated match proportion 0.313

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 129 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 49 matches and 2 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.239
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(10)8_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 8), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)8_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 526
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 526 weight vectors
  Containing 208 true matches and 318 true non-matches
    (39.54% true matches)
  Identified 497 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   480  (96.58%)
          2 :    14  (2.82%)
          3 :     2  (0.40%)
         12 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 497 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 315

Removed 1 non-pure weight vector

Final number of weight vectors to use: 525
  Number of unique weight vectors: 497

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (497, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 497 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 497 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 35 matches and 45 non-matches
    Purity of oracle classification:  0.562
    Entropy of oracle classification: 0.989
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 417 weight vectors
  Based on 35 matches and 45 non-matches
  Classified 142 matches and 275 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.5625, 0.9886994082884974, 0.4375)
    (275, 0.5625, 0.9886994082884974, 0.4375)

Current size of match and non-match training data sets: 35 / 45

Selected cluster (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 275 weight vectors
- Estimated match proportion 0.438

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 275 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
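
The "farthest first" selections shown throughout this run follow the classic greedy traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A generic sketch — Euclidean distance and first-vector seeding are assumptions; the original program may seed and measure differently:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly add the vector whose minimum distance to the selected set
    is largest."""
    dist = lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    # Minimum distance from each candidate to the current selected set
    min_d = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], dist(v, vectors[i]))
    return selected
```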

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 5 matches and 65 non-matches
    Purity of oracle classification:  0.929
    Entropy of oracle classification: 0.371
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)296_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 296), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)296_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 731
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 731 weight vectors
  Containing 221 true matches and 510 true non-matches
    (30.23% true matches)
  Identified 695 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   676  (97.27%)
          2 :    16  (2.30%)
          3 :     2  (0.29%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 695 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 507

Removed 1 non-pure weight vector

Final number of weight vectors to use: 730
  Number of unique weight vectors: 695
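
The non-pure handling above removes the minority-class copies of an almost-pure weight vector (pureness 0.941 here), while a later file in this run removes all copies of a less pure vector (pureness 0.875). A sketch of that pre-processing step; the 0.9 threshold separating the two cases is an assumption, not a value read from the program:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels, min_pureness=0.9):
    """Group identical weight vectors, compute their pureness (fraction of
    occurrences that are true matches), then drop minority-class copies of
    mostly-pure vectors and all copies of less pure ones. The min_pureness
    threshold is an assumption used to illustrate both removal messages."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)
    kept = []
    for vec, lab in zip(weight_vectors, labels):
        labs = groups[tuple(vec)]
        pureness = sum(labs) / len(labs)
        if pureness in (0.0, 1.0):            # pure vector: keep all copies
            kept.append((vec, lab))
        elif pureness >= min_pureness:        # mostly pure: drop minority copies
            if lab == (pureness >= 0.5):
                kept.append((vec, lab))
        # else: genuinely mixed vector, drop all copies
    return kept
```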

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (695, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 695 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 695 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 33 matches and 51 non-matches
    Purity of oracle classification:  0.607
    Entropy of oracle classification: 0.967
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 611 weight vectors
  Based on 33 matches and 51 non-matches
  Classified 149 matches and 462 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6071428571428571, 0.9666186325481028, 0.39285714285714285)
    (462, 0.6071428571428571, 0.9666186325481028, 0.39285714285714285)

Current size of match and non-match training data sets: 33 / 51

Selected cluster with (queue ordering: random):
- Purity 0.61 and entropy 0.97
- Size 149 weight vectors
- Estimated match proportion 0.393

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 53 matches and 4 non-matches
    Purity of oracle classification:  0.930
    Entropy of oracle classification: 0.367
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)751_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 751), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)751_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 423
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 423 weight vectors
  Containing 207 true matches and 216 true non-matches
    (48.94% true matches)
  Identified 390 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   376  (96.41%)
          2 :    11  (2.82%)
          3 :     2  (0.51%)
         19 :     1  (0.26%)

Identified 1 non-pure unique weight vector (from 390 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 215

Removed 1 non-pure weight vector

Final number of weight vectors to use: 422
  Number of unique weight vectors: 390

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (390, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 390 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 390 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 40 matches and 37 non-matches
    Purity of oracle classification:  0.519
    Entropy of oracle classification: 0.999
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 313 weight vectors
  Based on 40 matches and 37 non-matches
  Classified 132 matches and 181 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (132, 0.5194805194805194, 0.9989047442823606, 0.5194805194805194)
    (181, 0.5194805194805194, 0.9989047442823606, 0.5194805194805194)

Current size of match and non-match training data sets: 40 / 37

Selected cluster with (queue ordering: random):
- Purity 0.52 and entropy 1.00
- Size 132 weight vectors
- Estimated match proportion 0.519

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 132 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 50 matches and 6 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.491
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)554_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 554), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)554_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 328
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 328 weight vectors
  Containing 151 true matches and 177 true non-matches
    (46.04% true matches)
  Identified 310 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   300  (96.77%)
          2 :     7  (2.26%)
          3 :     2  (0.65%)
          8 :     1  (0.32%)

Identified 1 non-pure unique weight vector (from 310 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 135
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 174

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 320
  Number of unique weight vectors: 309

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (309, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 309 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 73

Perform initial selection using "far" method

Farthest first selection of 73 weight vectors from 309 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

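The "far" method above is a farthest-first traversal: starting from one vector, it greedily picks the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster. A sketch under the assumption of Euclidean distance (the paper's actual metric and starting-point choice may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors.

    Each step adds the vector maximising its minimum Euclidean
    distance to the vectors selected so far.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                     # arbitrary starting vector
    while len(selected) < min(k, len(vectors)):
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected

points = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0)]
print(farthest_first(points, 2))   # → [(0.0, 0.0), (1.0, 1.0)]
```

This favours corner-like, mutually distant weight vectors, which is why the selected lists above mix extreme all-high and all-low similarity vectors.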
Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 32 matches and 41 non-matches
    Purity of oracle classification:  0.562
    Entropy of oracle classification: 0.989
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  41
    Number of false non-matches: 0

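The purity and entropy reported for each oracle-classified sample follow the usual binary definitions: purity is the majority-class fraction, and entropy is the Shannon entropy (in bits) of the match/non-match split. A sketch reproducing the statistics above (0.562 and 0.989 for 32 matches vs 41 non-matches):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy (bits)."""
    n = num_matches + num_non_matches
    p = num_matches / n
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                    # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_and_entropy(32, 41)
print(round(purity, 3), round(entropy, 3))   # → 0.562 0.989
```

A perfectly mixed cluster has purity 0.5 and entropy 1.0 (the initial queue entry), while a pure cluster has purity 1.0 and entropy 0.0.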
Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 236 weight vectors
  Based on 32 matches and 41 non-matches
  Classified 106 matches and 130 non-matches

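After the oracle-labelled sample is removed, the remaining cluster is split by a classifier trained on that sample. A minimal sketch of the SVM split using scikit-learn; the kernel choice and library are assumptions, not necessarily what the authors' code uses:

```python
# Hedged sketch of the SVM-based cluster split (assumed linear kernel).
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining cluster into predicted matches and non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches

train = [[1.0, 0.9], [0.9, 1.0], [0.1, 0.2], [0.2, 0.1]]
labels = [1, 1, 0, 0]           # 1 = match, 0 = non-match
m, n = svm_split(train, labels, [[0.95, 0.95], [0.05, 0.15]])
print(len(m), len(n))   # → 1 1
```

The two predicted subsets become the child clusters pushed onto the queue, as seen in Loop 2 above (106 predicted matches and 130 predicted non-matches from 236 remaining vectors).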
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 73
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (106, 0.5616438356164384, 0.9890076795739704, 0.4383561643835616)
    (130, 0.5616438356164384, 0.9890076795739704, 0.4383561643835616)

Current size of match and non-match training data sets: 32 / 41

Selected cluster with (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 130 weight vectors
- Estimated match proportion 0.438

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 130 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.800, 0.636, 0.563, 0.545, 0.722] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 3 matches and 52 non-matches
    Purity of oracle classification:  0.945
    Entropy of oracle classification: 0.305
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analyzing file: diverg(10)643_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 643), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)643_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 521
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 521 weight vectors
  Containing 206 true matches and 315 true non-matches
    (39.54% true matches)
  Identified 492 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   475  (96.54%)
          2 :    14  (2.85%)
          3 :     2  (0.41%)
         12 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 492 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 312

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 520
  Number of unique weight vectors: 492

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (492, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 492 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 492 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 33 matches and 47 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.978
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 412 weight vectors
  Based on 33 matches and 47 non-matches
  Classified 142 matches and 270 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.5875, 0.9777945702913884, 0.4125)
    (270, 0.5875, 0.9777945702913884, 0.4125)

Current size of match and non-match training data sets: 33 / 47

Selected cluster with (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 142 weight vectors
- Estimated match proportion 0.412

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 50 matches and 6 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.491
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(15)218_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 218), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)218_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 159 matches and 780 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (780, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 780 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 78

Farthest first selection of 78 weight vectors from 780 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.667, 0.500, 0.647, 0.556, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.750, 0.429, 0.526, 0.500, 0.846] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.462, 0.889, 0.455, 0.211, 0.375] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.412, 0.318, 0.421] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.233, 0.545, 0.714, 0.455, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

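The farthest-first selection logged above is the classic greedy max-min traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A sketch assuming Euclidean distance and seeding with the first vector (the program's seed choice and metric may differ):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: select k vectors, each maximising
    its minimum distance to the vectors already selected."""
    def dist2(a, b):  # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]  # seed with the first vector (an assumption)
    # min_d[j] = distance from vectors[j] to its nearest selected vector
    min_d = [dist2(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        min_d = [min(d, dist2(v, vectors[i])) for v, d in zip(vectors, min_d)]
    return selected

pts = [[0.0], [0.1], [1.0], [0.5]]
print(farthest_first(pts, 3))  # [[0.0], [1.0], [0.5]]
```

This spreads the queried sample across the cluster's weight-vector space, which is why the selected vectors listed above are so heterogeneous.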
Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 2 matches and 76 non-matches
    Purity of oracle classification:  0.974
    Entropy of oracle classification: 0.172
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)801_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 801), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)801_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 689
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 689 weight vectors
  Containing 219 true matches and 470 true non-matches
    (31.79% true matches)
  Identified 656 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   640  (97.56%)
          2 :    13  (1.98%)
          3 :     2  (0.30%)
         17 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 656 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 469

Removed 1 non-pure weight vector

Final number of weight vectors to use: 688
  Number of unique weight vectors: 656

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (656, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 656 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 656 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 572 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 128 matches and 444 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (128, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (444, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 128 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 128 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 50 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.235
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)1000_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 1000), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)1000_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1003
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1003 weight vectors
  Containing 209 true matches and 794 true non-matches
    (20.84% true matches)
  Identified 949 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   912  (96.10%)
          2 :    34  (3.58%)
          3 :     2  (0.21%)
         17 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 949 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 175
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 773

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1002
  Number of unique weight vectors: 949

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (949, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 949 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 949 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 862 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 286 matches and 576 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (286, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (576, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 576 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 576 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 0 matches and 76 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(20)526_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 526), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)526_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec
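
The per-file analysis above (duplicate counts and the pureness of each unique weight vector) can be sketched as follows. The function and argument names are illustrative only, not the script's actual internals:

```python
from collections import Counter

def analyse_weight_vectors(vectors, labels):
    """Occurrence distribution and per-vector pureness, as reported in the log.

    vectors: list of weight vectors (lists of floats);
    labels:  the corresponding true match status (True/False).
    """
    counts = Counter(map(tuple, vectors))
    match_counts = Counter(tuple(v) for v, m in zip(vectors, labels) if m)
    # Occurrence : number of unique weight vectors that occur that often
    freq_dist = Counter(counts.values())
    # Pureness of a unique vector = fraction of its copies that are true matches
    pureness = {v: match_counts[v] / c for v, c in counts.items()}
    return freq_dist, pureness
```

A unique weight vector with pureness strictly between 0 and 1 (like the single 0.950 vector above) is the "non-pure" case the log removes.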

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
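
The "far" initial selection above is a farthest-first traversal. A minimal sketch, assuming Euclidean distance and an arbitrary seed vector (the original script may seed or measure distance differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection: seed with the first vector, then
    repeatedly add the vector whose minimum distance to the already
    selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < min(k, len(vectors)):
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```

This greedy choice is what spreads the sample over the corners of the weight-vector space, which is why the selected vectors above mix clear matches and clear non-matches.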

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
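
The oracle step and its purity/entropy summary can be reproduced with a small sketch: the oracle flips each true label with probability 1 - accuracy, purity is the majority-class fraction, and entropy the binary Shannon entropy of the label split (definitions inferred from the log's numbers, not taken from the script):

```python
import math
import random

def oracle_classify(true_labels, accuracy=1.0, seed=0):
    """Simulated oracle: return each true label, flipped with prob. 1 - accuracy."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab for lab in true_labels]

def purity_entropy(labels):
    """Purity = majority-class fraction; entropy = binary Shannon entropy."""
    p = sum(labels) / len(labels)          # fraction classified as matches
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 23 matches and 65 non-matches above this gives purity 65/88 ≈ 0.739 and entropy ≈ 0.829, matching the log.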

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches
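
The split step above trains a classifier on the oracle-labelled sample and partitions the remaining cluster by its predictions. A sketch using scikit-learn's `SVC` (assuming that is the SVM implementation; the kernel choice here is illustrative):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled sample, then split the rest of the
    cluster into predicted-match and predicted-non-match sub-clusters."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, pred) if p]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if not p]
    return matches, non_matches
```

A degenerate split like the one above (0 predicted matches from 956 vectors) leaves only one non-empty sub-cluster to re-queue.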

39.0
Analysing the file: diverg(20)412_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 412), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)412_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 788
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 788 weight vectors
  Containing 208 true matches and 580 true non-matches
    (26.40% true matches)
  Identified 759 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   742  (97.76%)
          2 :    14  (1.84%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 759 unique weight vectors)
Pureness (fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 577

Removed 1 non-pure weight vector

Final number of weight vectors to use: 787
  Number of unique weight vectors: 759

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (759, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 759 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 759 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 674 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 144 matches and 530 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (530, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 530 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 530 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 5 matches and 70 non-matches
    Purity of oracle classification:  0.933
    Entropy of oracle classification: 0.353
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
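
The overall run visible in this log (pop a cluster from the queue, oracle-label a sample, delete it from the cluster, and re-queue sub-clusters until the manual classification budget is exhausted) can be condensed into a driver-loop sketch. The purity check, farthest-first sampling, and SVM split are reduced to placeholders here, and every name and default is an assumption:

```python
import random

def recursive_select(vectors, labels, budget, sample_size,
                     max_cluster_size=50, seed=0):
    """High-level sketch of the recursive training-example selection loop."""
    rng = random.Random(seed)
    queue = [list(range(len(vectors)))]        # start with one cluster of all indices
    train_m, train_n, used = [], [], 0
    while queue and used < budget:
        cluster = queue.pop(rng.randrange(len(queue)))  # queue ordering: random
        sample = cluster[:sample_size]         # stand-in for farthest-first selection
        for i in sample:                       # a 100% accurate oracle
            (train_m if labels[i] else train_n).append(vectors[i])
        used += len(sample)
        rest = cluster[sample_size:]           # sampled vectors deleted from cluster
        if len(rest) > max_cluster_size:       # "not pure enough or too large"
            mid = len(rest) // 2               # stand-in for the SVM split
            queue += [rest[:mid], rest[mid:]]
    return train_m, train_n, used
```

When `used` reaches `budget` the loop stops, which corresponds to the "Reached end of manual classification budget" message above.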

57.0
Analysing the file: diverg(10)491_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990099
recall                 0.334448
f-measure                   0.5
da                          101
dm                            0
ndm                           0
tp                          100
fp                            1
tn                  4.76529e+07
fn                          199
Name: (10, 1 - acm diverg, 491), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)491_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 452
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 452 weight vectors
  Containing 161 true matches and 291 true non-matches
    (35.62% true matches)
  Identified 431 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   420  (97.45%)
          2 :     8  (1.86%)
          3 :     2  (0.46%)
         10 :     1  (0.23%)

Identified 1 non-pure unique weight vector (from 431 unique weight vectors)
Pureness (fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 142
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 288

Removed 1 non-pure weight vector

Final number of weight vectors to use: 451
  Number of unique weight vectors: 431

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (431, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 431 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 431 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 38 matches and 40 non-matches
    Purity of oracle classification:  0.513
    Entropy of oracle classification: 1.000
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  40
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 353 weight vectors
  Based on 38 matches and 40 non-matches
  Classified 272 matches and 81 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (272, 0.5128205128205128, 0.9995256892936493, 0.48717948717948717)
    (81, 0.5128205128205128, 0.9995256892936493, 0.48717948717948717)

Current size of match and non-match training data sets: 38 / 40

Selected cluster with (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 81 weight vectors
- Estimated match proportion 0.487

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 81 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
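
Farthest-first selection, as used above, greedily grows a sample by always taking the candidate whose nearest already-selected vector is farthest away, spreading the sample across the cluster. A minimal sketch (the actual start-vector choice and distance metric used by the script are assumptions):

```python
import random

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal over weight vectors."""
    rnd = random.Random(seed)
    remaining = [list(v) for v in vectors]
    selected = [remaining.pop(rnd.randrange(len(remaining)))]  # random start

    def dist(a, b):  # Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    while remaining and len(selected) < k:
        # pick the candidate farthest from its nearest selected vector
        i = max(range(len(remaining)),
                key=lambda j: min(dist(remaining[j], s) for s in selected))
        selected.append(remaining.pop(i))
    return selected

# two tight groups in opposite corners; the sample straddles both
sample = farthest_first([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]], 2)
```

Whichever corner the random start lands in, the second pick comes from the opposite corner, which is exactly the behaviour that makes the sampled vectors informative for the oracle.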

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 0 matches and 44 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

101.0
Analysing file: diverg(15)219_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 219), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)219_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 401
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 401 weight vectors
  Containing 209 true matches and 192 true non-matches
    (52.12% true matches)
  Identified 370 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   356  (96.22%)
          2 :    11  (2.97%)
          3 :     2  (0.54%)
         17 :     1  (0.27%)
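
The uniqueness analysis and occurrence distribution above can be reproduced with a `Counter` over the (hashable) weight vectors; a small sketch on toy data (the real vectors are the 7-dimensional similarity rows loaded from the CSV):

```python
from collections import Counter

# toy weight vectors, stored as tuples so they are hashable
vectors = [(1.0, 0.5), (1.0, 0.5), (0.3, 0.9),
           (0.3, 0.9), (0.3, 0.9), (0.7, 0.2)]

occurrences = Counter(vectors)                # unique vector -> occurrence count
distribution = Counter(occurrences.values())  # occurrence count -> #unique vectors

print(len(occurrences))              # 3 unique weight vectors
print(sorted(distribution.items()))  # [(1, 1), (2, 1), (3, 1)]
```

The second `Counter` gives exactly the "Occurrence : Number of weight vectors" table printed above.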

Identified 1 non-pure unique weight vector (from 370 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 178
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 191

Removed 1 non-pure weight vector

Final number of weight vectors to use: 400
  Number of unique weight vectors: 370

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (370, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 370 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 370 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 39 matches and 37 non-matches
    Purity of oracle classification:  0.513
    Entropy of oracle classification: 1.000
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 294 weight vectors
  Based on 39 matches and 37 non-matches
  Classified 137 matches and 157 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (137, 0.5131578947368421, 0.9995003941817588, 0.5131578947368421)
    (157, 0.5131578947368421, 0.9995003941817588, 0.5131578947368421)

Current size of match and non-match training data sets: 39 / 37

Selected cluster (queue ordering: random) with:
- Purity 0.51 and entropy 1.00
- Size 157 weight vectors
- Estimated match proportion 0.513

Sample size for this cluster: 60

Farthest first selection of 60 weight vectors from 157 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.850, 1.000, 0.179, 0.205, 0.188, 0.061, 0.180] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.902, 1.000, 0.182, 0.071, 0.182, 0.222, 0.190] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.947, 1.000, 0.292, 0.178, 0.227, 0.122, 0.154] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.663, 1.000, 0.273, 0.244, 0.226, 0.196, 0.238] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 1.000, 0.224, 0.219, 0.140, 0.209, 0.161] (False)
    [0.663, 1.000, 0.132, 0.143, 0.241, 0.174, 0.167] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.747, 1.000, 0.231, 0.167, 0.107, 0.222, 0.125] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 60 weight vectors
  The oracle will correctly classify 60 weight vectors and wrongly classify 0
  Classified 8 matches and 52 non-matches
    Purity of oracle classification:  0.867
    Entropy of oracle classification: 0.567
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 60 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)995_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 995), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)995_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 382
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 382 weight vectors
  Containing 212 true matches and 170 true non-matches
    (55.50% true matches)
  Identified 346 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   329  (95.09%)
          2 :    14  (4.05%)
          3 :     2  (0.58%)
         19 :     1  (0.29%)

Identified 1 non-pure unique weight vector (from 346 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 178
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 167

Removed 1 non-pure weight vector

Final number of weight vectors to use: 381
  Number of unique weight vectors: 346

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (346, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 346 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 346 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 29 matches and 46 non-matches
    Purity of oracle classification:  0.613
    Entropy of oracle classification: 0.963
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  46
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 271 weight vectors
  Based on 29 matches and 46 non-matches
  Classified 148 matches and 123 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6133333333333333, 0.9626147059982517, 0.38666666666666666)
    (123, 0.6133333333333333, 0.9626147059982517, 0.38666666666666666)

Current size of match and non-match training data sets: 29 / 46

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.96
- Size 123 weight vectors
- Estimated match proportion 0.387

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 4 matches and 49 non-matches
    Purity of oracle classification:  0.925
    Entropy of oracle classification: 0.386
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)538_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (10, 1 - acm diverg, 538), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)538_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 361
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 361 weight vectors
  Containing 197 true matches and 164 true non-matches
    (54.57% true matches)
  Identified 332 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   319  (96.08%)
          2 :    10  (3.01%)
          3 :     2  (0.60%)
         16 :     1  (0.30%)

Identified 1 non-pure unique weight vector (from 332 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 168
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 163

Removed 1 non-pure weight vector

Final number of weight vectors to use: 360
  Number of unique weight vectors: 332

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (332, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 332 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 332 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
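
A greedy farthest-first traversal, as used for the selection above, repeatedly picks the vector whose distance to the nearest already-selected vector is largest. A minimal sketch, assuming Euclidean distance and an arbitrary starting vector (the program's actual metric and seeding may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[idx]))
    return selected
```

Each round updates one distance per vector, so selecting k of n vectors costs O(nk) distance computations.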

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 47 matches and 27 non-matches
    Purity of oracle classification:  0.635
    Entropy of oracle classification: 0.947
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  27
    Number of false non-matches: 0
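
The purity and entropy reported for the oracle sample follow directly from the match / non-match counts: purity is the majority-class fraction, and entropy is the Shannon entropy (base 2) of the two-class split. A sketch:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Majority-class purity and base-2 Shannon entropy of a
    match / non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy
```

For 47 matches and 27 non-matches this gives purity 47/74 ≈ 0.635 and entropy ≈ 0.947, matching the values above.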

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 258 weight vectors
  Based on 47 matches and 27 non-matches
  Classified 258 matches and 0 non-matches
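
The split step trains a classifier on the oracle-labelled sample and uses it to partition the cluster's remaining weight vectors. A minimal sketch using scikit-learn's `SVC`; the kernel and parameters here are assumptions, since the log does not show the program's actual SVM settings:

```python
from sklearn.svm import SVC

def svm_split(train_matches, train_non_matches, remaining):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining unlabelled vectors into predicted matches / non-matches."""
    X = train_matches + train_non_matches
    y = [1] * len(train_matches) + [0] * len(train_non_matches)
    clf = SVC(kernel='linear')  # assumed kernel; not stated in the log
    clf.fit(X, y)
    pred = clf.predict(remaining)
    matches = [v for v, p in zip(remaining, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining, pred) if p == 0]
    return matches, non_matches
```

In the runs further below, the two predicted sub-clusters re-enter the queue (e.g. "Loop 2: Queue length: 2").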

63.0
Analysing the file: diverg(20)357_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (20, 1 - acm diverg, 357), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)357_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1087
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1087 weight vectors
  Containing 214 true matches and 873 true non-matches
    (19.69% true matches)
  Identified 1033 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   998  (96.61%)
          2 :    32  (3.10%)
          3 :     2  (0.19%)
         19 :     1  (0.10%)
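
The frequency-of-frequencies table above can be produced with two `Counter` passes: first count how often each unique weight vector occurs, then count how many vectors share each occurrence count. A sketch with hypothetical data:

```python
from collections import Counter

# Hypothetical weight vectors; one vector occurs three times.
weight_vectors = [(1.0, 0.9), (1.0, 0.9), (0.5, 0.2), (1.0, 0.9), (0.3, 0.1)]

occ = Counter(weight_vectors)        # vector -> occurrence count
freq_dist = Counter(occ.values())    # occurrence count -> number of vectors

for occurrence, num_vecs in sorted(freq_dist.items()):
    pct = 100.0 * num_vecs / len(occ)
    print('%10d : %5d  (%.2f%%)' % (occurrence, num_vecs, pct))
```

Percentages are relative to the number of unique vectors, as in the table above (e.g. 998 / 1033 = 96.61%).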

Identified 1 non-pure unique weight vector (from 1033 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1086
  Number of unique weight vectors: 1033

Time to load and analyse the weight vector file: 0.05 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1033, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1033 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1033 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 945 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 98 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (98, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 98 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 98 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 42 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(10)851_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 851), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)851_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 626
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 626 weight vectors
  Containing 213 true matches and 413 true non-matches
    (34.03% true matches)
  Identified 574 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   538  (93.73%)
          2 :    33  (5.75%)
          3 :     2  (0.35%)
         16 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 574 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 392

Removed 1 non-pure weight vector

Final number of weight vectors to use: 625
  Number of unique weight vectors: 574

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (574, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 574 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 574 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 29 matches and 53 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.937
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 492 weight vectors
  Based on 29 matches and 53 non-matches
  Classified 179 matches and 313 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)
    (313, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)

Current size of match and non-match training data sets: 29 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 179 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 59

Farthest first selection of 59 weight vectors from 179 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.942, 1.000, 0.156, 0.172, 0.189, 0.148, 0.133] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 59 weight vectors
  The oracle will correctly classify 59 weight vectors and wrongly classify 0
  Classified 45 matches and 14 non-matches
    Purity of oracle classification:  0.763
    Entropy of oracle classification: 0.791
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  14
    Number of false non-matches: 0

Deleted 59 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(10)602_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (10, 1 - acm diverg, 602), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)602_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 744
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 744 weight vectors
  Containing 169 true matches and 575 true non-matches
    (22.72% true matches)
  Identified 707 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   676  (95.62%)
          2 :    28  (3.96%)
          3 :     2  (0.28%)
          6 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 707 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 152
     0.000 : 555

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 744
  Number of unique weight vectors: 707

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (707, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 707 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 707 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
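The farthest-first selection listed above can be sketched as follows. This is a minimal, hypothetical version that assumes plain Euclidean distance between weight vectors and seeds from the first vector; the actual script's distance metric and seeding may differ.

```python
# Minimal sketch of farthest-first selection (assumed Euclidean distance).
import math

def euclidean(u, v):
    # Straight-line distance between two weight vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    """Greedily pick k vectors; each new pick maximises its distance
    to the closest already-selected vector."""
    selected = [vectors[0]]  # seed with the first vector (an assumption)
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(euclidean(v, s) for s in selected),
        )
        selected.append(best)
    return selected
```

With this greedy rule the sample spreads over the extremes of the cluster, which is why the lists above mix clearly matching and clearly non-matching vectors.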

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 26 matches and 58 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.893
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
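The purity and entropy figures printed for each oracle classification can be reproduced as below, assuming purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion (both assumptions match the numbers in this log, e.g. 26 matches / 58 non-matches gives 0.690 and 0.893).

```python
# Sketch of the purity / entropy figures (assumed definitions).
import math

def purity_entropy(num_matches, num_non_matches):
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)         # majority-class fraction
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)  # binary Shannon entropy
    return purity, entropy
```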

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 623 weight vectors
  Based on 26 matches and 58 non-matches
  Classified 119 matches and 504 non-matches
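The SVM split step above can be sketched as follows: train on the oracle-labelled samples, then partition the remaining unlabelled vectors by the predicted class. This sketch assumes scikit-learn's `SVC` with a linear kernel; the script's actual SVM implementation and parameters may differ.

```python
# Hedged sketch of the SVM-based cluster split (assumes scikit-learn).
from sklearn.svm import SVC

def svm_split(labelled_vectors, labels, remaining_vectors):
    """Fit an SVM on oracle-classified samples, then split the rest
    into predicted matches (label 1) and non-matches (label 0)."""
    clf = SVC(kernel="linear")
    clf.fit(labelled_vectors, labels)
    preds = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vectors, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are what get pushed back onto the queue for the next loop.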

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (119, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)
    (504, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)

Current size of match and non-match training data sets: 26 / 58

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 119 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 119 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 45 matches and 4 non-matches
    Purity of oracle classification:  0.918
    Entropy of oracle classification: 0.408
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing file: diverg(10)592_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.982759
recall                 0.190635
f-measure              0.319328
da                           58
dm                            0
ndm                           0
tp                           57
fp                            1
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 592), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)592_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 683
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 683 weight vectors
  Containing 202 true matches and 481 true non-matches
    (29.58% true matches)
  Identified 632 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   598  (94.62%)
          2 :    31  (4.91%)
          3 :     2  (0.32%)
         17 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 632 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 171
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 460

Removed 1 non-pure weight vector

Final number of weight vectors to use: 682
  Number of unique weight vectors: 632

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (632, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 632 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 632 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 549 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 143 matches and 406 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (406, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 143 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 143 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)649_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 649), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)649_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1073
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1073 weight vectors
  Containing 226 true matches and 847 true non-matches
    (21.06% true matches)
  Identified 1016 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   979  (96.36%)
          2 :    34  (3.35%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1016 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 826

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1072
  Number of unique weight vectors: 1016

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1016, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1016 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1016 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 929 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 332 matches and 597 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (332, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (597, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 597 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 597 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.692, 0.583, 0.500, 0.750, 0.731] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
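
Farthest-first selection greedily picks, at each step, the vector whose minimum distance to the already-selected set is largest, spreading the sample across the cluster. A minimal sketch with Euclidean distance and the first vector as seed (the seed choice is an assumption; the original may start elsewhere, e.g. from a corner):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]          # seed: first vector (assumption)
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # pick the vector farthest from everything selected so far
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```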

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0
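
The oracle is a simulated human reviewer: with the configured accuracy each queried vector receives its true label, and the flipped label otherwise (at 100% accuracy, as here, no labels are flipped). A minimal sketch (function and parameter names are illustrative):

```python
import random

def oracle_classify(weight_vectors, true_labels, accuracy, seed=None):
    """Simulate a human oracle of a given accuracy: answer the true
    match status with probability `accuracy`, the wrong one otherwise."""
    rng = random.Random(seed)
    matches, non_matches = [], []
    for vec, label in zip(weight_vectors, true_labels):
        answer = label if rng.random() < accuracy else not label
        (matches if answer else non_matches).append(vec)
    return matches, non_matches
```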

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)262_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 262), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)262_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 226 true matches and 857 true non-matches
    (20.87% true matches)
  Identified 1026 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   989  (96.39%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1026 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vectors
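
The pureness analysis groups identical weight vectors, computes the fraction of true matches within each group, and removes the minority-class copies of any group that is not fully pure (here, one non-match copy inside the 0.950-pure group of duplicates). A sketch of that clean-up (names illustrative):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels):
    """Drop minority-class copies of weight vectors whose duplicates
    carry mixed true-match labels; pure groups are kept unchanged."""
    groups = defaultdict(list)
    for vec, label in zip(weight_vectors, labels):
        groups[tuple(vec)].append(label)
    kept = []
    for vec, group_labels in groups.items():
        pureness = sum(group_labels) / len(group_labels)
        majority = pureness >= 0.5  # majority class of this group
        for label in group_labels:
            if pureness in (0.0, 1.0) or label == majority:
                kept.append((list(vec), label))
    return kept
```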

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1026

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1026, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1026 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1026 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 938 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 177 matches and 761 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (177, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (761, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 177 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 177 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 44 matches and 14 non-matches
    Purity of oracle classification:  0.759
    Entropy of oracle classification: 0.797
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  14
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)291_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 291), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)291_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 946
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 946 weight vectors
  Containing 219 true matches and 727 true non-matches
    (23.15% true matches)
  Identified 891 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   855  (95.96%)
          2 :    33  (3.70%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 891 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 706

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 945
  Number of unique weight vectors: 891

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (891, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 891 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 891 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 805 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 130 matches and 675 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (675, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 675 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 675 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 11 matches and 58 non-matches
    Purity of oracle classification:  0.841
    Entropy of oracle classification: 0.633
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

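The purity and entropy figures the oracle block reports follow the usual definitions: purity is the fraction of the majority class, and entropy is the base-2 Shannon entropy of the match/non-match split. A minimal sketch reproducing the values above for 11 matches and 58 non-matches (the function name is illustrative, not taken from the program):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    # Purity: fraction of the majority class among the classified vectors.
    # Entropy: base-2 Shannon entropy of the match/non-match distribution.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log(q, 2) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = purity_and_entropy(11, 58)
print('%.3f %.3f' % (purity, entropy))  # 0.841 0.633
```
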
Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(10)551_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 551), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)551_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 508
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 508 weight vectors
  Containing 207 true matches and 301 true non-matches
    (40.75% true matches)
  Identified 479 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   462  (96.45%)
          2 :    14  (2.92%)
          3 :     2  (0.42%)
         12 :     1  (0.21%)

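The occurrence frequency distribution above can be computed with two nested `Counter` passes: one counting how often each unique weight vector occurs, and one counting how many unique vectors share each occurrence count. A sketch with made-up example vectors (the real program reads them from the weight vector file):

```python
from collections import Counter

# Made-up stand-ins for loaded weight vectors; identical tuples
# represent record pairs that produced identical weight vectors.
weight_vectors = [
    (1.0, 0.0, 0.5), (1.0, 0.0, 0.5), (1.0, 0.0, 0.5),
    (0.7, 1.0, 0.2), (0.3, 0.0, 0.9),
]

occurrences = Counter(weight_vectors)      # vector -> how often it occurs
freq_dist = Counter(occurrences.values())  # occurrence count -> num vectors
num_unique = len(occurrences)

for count in sorted(freq_dist):
    n = freq_dist[count]
    print('%11d : %5d  (%.2f%%)' % (count, n, 100.0 * n / num_unique))
```
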
Identified 1 non-pure unique weight vector (from 479 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 298

Removed 1 non-pure weight vector

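Removing the minority class from a non-pure unique weight vector (such as the one above with pureness 0.917, i.e. 11 matches and 1 non-match among its 12 occurrences) amounts to keeping only the copies carrying the majority true-match label. One way such a filter could look; the function name and tie handling are assumptions, not taken from the program:

```python
from collections import defaultdict

def remove_minority_class(labelled_vectors):
    # Group copies of each unique weight vector with their true match
    # status, then keep only the copies carrying the majority label.
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        majority = sum(labels) * 2 >= len(labels)  # tie kept as match
        kept.extend((vec, m) for m in labels if m == majority)
    return kept
```
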
Final number of weight vectors to use: 507
  Number of unique weight vectors: 479

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (479, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 479 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 479 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

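The farthest first selection above can be sketched as a greedy traversal: seed the selection with one vector, then repeatedly add the vector whose minimum Euclidean distance to the already selected vectors is largest. The seed choice and distance metric here are assumptions; the actual program may differ:

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over a list of weight vectors.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]          # seed choice is an assumption
    remaining = list(vectors[1:])
    while remaining and len(selected) < k:
        # Pick the vector farthest (in min-distance terms) from the
        # current selection, spreading samples across the cluster.
        nxt = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected
```
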
Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 36 matches and 44 non-matches
    Purity of oracle classification:  0.550
    Entropy of oracle classification: 0.993
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 399 weight vectors
  Based on 36 matches and 44 non-matches
  Classified 239 matches and 160 non-matches

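The SVM classification step trains on the oracle-labelled vectors and partitions the unclassified remainder of the cluster by predicted class. A sketch assuming scikit-learn is available (`svm.SVC` with a linear kernel is an assumption; the original program's SVM implementation and kernel are not shown in this log):

```python
from sklearn import svm

def svm_split(train_matches, train_non_matches, cluster_vectors):
    # Train a binary SVM on the oracle-labelled weight vectors.
    X = train_matches + train_non_matches
    y = [1] * len(train_matches) + [0] * len(train_non_matches)
    clf = svm.SVC(kernel='linear')  # kernel choice is an assumption
    clf.fit(X, y)
    # Split the remaining cluster by predicted class, yielding two
    # child clusters to push back onto the queue.
    pred = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, pred) if p == 0]
    return matches, non_matches
```

In the run above this step splits the 399 remaining vectors into clusters of 239 predicted matches and 160 predicted non-matches.
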
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (239, 0.55, 0.9927744539878084, 0.45)
    (160, 0.55, 0.9927744539878084, 0.45)

Current size of match and non-match training data sets: 36 / 44

Selected cluster (queue ordering: random):
- Purity 0.55 and entropy 0.99
- Size 160 weight vectors
- Estimated match proportion 0.450

Sample size for this cluster: 60

Farthest first selection of 60 weight vectors from 160 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [0.530, 1.000, 0.159, 0.086, 0.182, 0.159, 0.163] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.820, 1.000, 0.190, 0.273, 0.163, 0.122, 0.143] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.818, 0.727, 0.438, 0.375, 0.400] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 60 weight vectors
  The oracle will correctly classify 60 weight vectors and wrongly classify 0
  Classified 1 match and 59 non-matches
    Purity of oracle classification:  0.983
    Entropy of oracle classification: 0.122
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 60 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(15)256_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 256), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)256_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 585
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 585 weight vectors
  Containing 208 true matches and 377 true non-matches
    (35.56% true matches)
  Identified 552 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   535  (96.92%)
          2 :    14  (2.54%)
          3 :     2  (0.36%)
         16 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 552 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 177
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 374

Removed 1 non-pure weight vector

Final number of weight vectors to use: 584
  Number of unique weight vectors: 552

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (552, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 552 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 552 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 29 matches and 53 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.937
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 470 weight vectors
  Based on 29 matches and 53 non-matches
  Classified 153 matches and 317 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)
    (317, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)

Current size of match and non-match training data sets: 29 / 53

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 153 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 49 matches and 7 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analyzing file: diverg(15)531_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 531), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)531_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 528
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 528 weight vectors
  Containing 208 true matches and 320 true non-matches
    (39.39% true matches)
  Identified 499 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   482  (96.59%)
          2 :    14  (2.81%)
          3 :     2  (0.40%)
         12 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 499 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 317

Removed 1 non-pure weight vector

Final number of weight vectors to use: 527
  Number of unique weight vectors: 499

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (499, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 499 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 499 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 34 matches and 46 non-matches
    Purity of oracle classification:  0.575
    Entropy of oracle classification: 0.984
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  46
    Number of false non-matches: 0
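
The purity and entropy values printed for each oracle classification follow directly from the match and non-match counts: purity is the majority-class fraction of the classified sample, and entropy is the Shannon entropy (base 2) of the class proportions. A minimal sketch reproducing the figures above (the function names here are illustrative, not taken from the program):

```python
import math

def purity(num_matches, num_non_matches):
    # Purity is the fraction of the majority class in the classified sample.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Shannon entropy (base 2) of the match / non-match proportions.
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        p = count / total
        if p > 0.0:
            h -= p * math.log(p, 2)
    return h

# The values reported for 34 matches and 46 non-matches:
print(round(purity(34, 46), 3))   # 0.575
print(round(entropy(34, 46), 3))  # 0.984
```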

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 419 weight vectors
  Based on 34 matches and 46 non-matches
  Classified 143 matches and 276 non-matches

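Each SVM split trains a classifier on the oracle-labelled sample and uses it to divide the remaining unlabelled weight vectors into a predicted-match and a predicted-non-match child cluster, which are then both pushed onto the queue. A rough sketch of this step, assuming scikit-learn with a linear kernel (the original program's SVM implementation and parameters are not shown in the log):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, unlabelled_vecs):
    # Train on the oracle-labelled sample (1 = match, 0 = non-match),
    # then split the remaining vectors by predicted class.
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 0]
    return matches, non_matches

# Hypothetical 2-dimensional weight vectors for illustration only.
train = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
labels = [1, 1, 0, 0]
m, n = svm_split(train, labels, [[0.85, 0.85], [0.15, 0.15]])
print(len(m), len(n))  # expect one predicted match and one predicted non-match
```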
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.575, 0.9837082626231857, 0.425)
    (276, 0.575, 0.9837082626231857, 0.425)

Current size of match and non-match training data sets: 34 / 46

Selected cluster (queue ordering: random) with:
- Purity 0.57 and entropy 0.98
- Size 276 weight vectors
- Estimated match proportion 0.425

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 276 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)

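The farthest-first selection used above builds the sample greedily: after seeding, each step picks the weight vector whose minimum distance to the vectors already selected is largest, so the sample spreads across the whole cluster. A small sketch under the assumptions of Euclidean distance and seeding with the first vector (the program's actual metric, seeding, and tie-breaking may differ):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: repeatedly add the vector whose
    # minimum Euclidean distance to the selected set is largest.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # assumption: first vector seeds the traversal
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected

# Toy 2-dimensional example for illustration only.
sample = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(sample, 3))  # → [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```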
Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 5 matches and 65 non-matches
    Purity of oracle classification:  0.929
    Entropy of oracle classification: 0.371
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)570_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (15, 1 - acm diverg, 570), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)570_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 652
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 652 weight vectors
  Containing 190 true matches and 462 true non-matches
    (29.14% true matches)
  Identified 612 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   578  (94.44%)
          2 :    31  (5.07%)
          3 :     2  (0.33%)
          6 :     1  (0.16%)

Identified 0 non-pure unique weight vectors (from 612 unique weight vectors)
Pureness (as proportion of matches) for a given unique weight vector:
  Pureness : Count
     1.000 : 170
     0.000 : 442

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 652
  Number of unique weight vectors: 612

Time to load and analyse the weight vector file: 0.01 sec
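
The frequency distribution and pureness statistics reported above can be computed by grouping identical weight vectors; a sketch with illustrative names (assumptions: vectors are hashable tuples, labels are booleans; the program's own data structures may differ):

```python
from collections import Counter

def analyse_weight_vectors(vectors, labels):
    # Count how often each distinct weight vector occurs, and the
    # 'pureness' of each: the fraction of its occurrences that are matches.
    freq = Counter(tuple(v) for v in vectors)
    match_count = Counter()
    for v, is_match in zip(vectors, labels):
        if is_match:
            match_count[tuple(v)] += 1
    pureness = {v: match_count[v] / n for v, n in freq.items()}
    return freq, pureness

# Toy example: one vector occurring twice (both matches), one non-match.
vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3)]
labels = [True, True, False]
freq, pureness = analyse_weight_vectors(vecs, labels)
print(freq[(1.0, 0.5)], pureness[(1.0, 0.5)])  # 2 1.0
```

A vector whose pureness is strictly between 0 and 1 is "non-pure": it was generated by both matching and non-matching record pairs.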

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (612, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 612 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 612 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 529 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 142 matches and 387 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (387, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 142 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 48 matches and 5 non-matches
    Purity of oracle classification:  0.906
    Entropy of oracle classification: 0.451
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(15)761_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 761), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)761_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 548
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 548 weight vectors
  Containing 226 true matches and 322 true non-matches
    (41.24% true matches)
  Identified 509 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   490  (96.27%)
          2 :    16  (3.14%)
          3 :     2  (0.39%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 509 unique weight vectors)
Pureness (as proportion of matches) for a given unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 319

Removed 1 non-pure weight vector

Final number of weight vectors to use: 547
  Number of unique weight vectors: 509

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (509, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 509 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 509 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 33 matches and 48 non-matches
    Purity of oracle classification:  0.593
    Entropy of oracle classification: 0.975
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 428 weight vectors
  Based on 33 matches and 48 non-matches
  Classified 152 matches and 276 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.5925925925925926, 0.975119064940866, 0.4074074074074074)
    (276, 0.5925925925925926, 0.975119064940866, 0.4074074074074074)

Current size of match and non-match training data sets: 33 / 48

Selected cluster (queue ordering: random) with:
- Purity 0.59 and entropy 0.98
- Size 152 weight vectors
- Estimated match proportion 0.407

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 152 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 53 matches and 5 non-matches
    Purity of oracle classification:  0.914
    Entropy of oracle classification: 0.424
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0
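The purity and entropy figures reported for each oracle classification follow the standard two-class definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (function names are illustrative, not from the program):

```python
import math

def cluster_purity(num_matches, num_non_matches):
    """Majority-class fraction of a two-class cluster."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    """Binary Shannon entropy of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0  # a pure cluster has zero entropy
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# The oracle above classified 53 matches and 5 non-matches:
print(round(cluster_purity(53, 5), 3))   # 0.914
print(round(cluster_entropy(53, 5), 3))  # 0.424
```

These reproduce the 0.914 / 0.424 values reported above.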

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)515_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 515), dtype: object
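The precision, recall, and f-measure rows in these per-file summaries are consistent with the usual definitions computed from the tp/fp/fn counts. A quick check against the block above (helper names are illustrative):

```python
def precision(tp, fp):
    """Fraction of declared matches that are true matches."""
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    """Fraction of true matches that were found."""
    return tp / (tp + fn) if tp + fn else 0.0

def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0

# tp=43, fp=0, fn=256 as reported above:
print(round(precision(43, 0), 6))       # 1.0
print(round(recall(43, 256), 6))        # 0.143813
print(round(f_measure(43, 0, 256), 6))  # 0.251462
```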

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)515_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 244
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 244 weight vectors
  Containing 205 true matches and 39 true non-matches
    (84.02% true matches)
  Identified 214 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   200  (93.46%)
          2 :    11  (5.14%)
          3 :     2  (0.93%)
         16 :     1  (0.47%)

Identified 1 non-pure unique weight vector (from 214 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 38

Removed 1 non-pure weight vector

Final number of weight vectors to use: 243
  Number of unique weight vectors: 214
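The non-pure-vector removal reported above can be sketched as follows: identical weight vectors are grouped, each group's pureness is its match fraction, and for groups that are neither all-match nor all-non-match the minority-class copies are dropped. This is a sketch under the assumption that removal works per identical-vector group; names are illustrative:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """weight_vectors: list of (tuple_of_weights, is_match) pairs.
    Drop minority-class copies of any identical vector that occurs
    with both match and non-match labels."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        num_match = sum(labels)
        pureness = num_match / len(labels)
        if pureness in (0.0, 1.0):   # pure: keep every copy
            kept += [(vec, m) for m in labels]
        else:                        # non-pure: keep only the majority class
            majority = pureness > 0.5
            kept += [(vec, majority)] * max(num_match, len(labels) - num_match)
    return kept

# One vector occurs 16 times: 15 matches, 1 non-match (pureness 15/16 = 0.938)
data = [((0.9, 1.0), True)] * 15 + [((0.9, 1.0), False)] + [((0.1, 0.2), False)]
print(len(remove_non_pure(data)))  # 16  (one minority-class copy removed)
```

This matches the run above, where a single vector with pureness 0.938 loses its one minority-class copy, reducing 244 weight vectors to 243.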

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (214, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 214 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 66

Perform initial selection using "far" method

Farthest first selection of 66 weight vectors from 214 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
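The "farthest first" selections shown here are the classic greedy farthest-first traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch (the program's seeding and distance metric are not shown; a fixed start index and Euclidean distance are assumptions):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first selection of k vector indices (Euclidean)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    while len(selected) < k:
        # A vector's distance to the selected set is its minimum
        # distance to any already-selected vector; pick the maximum.
        best_i, best_d = None, -1.0
        for i, v in enumerate(vectors):
            if i in selected:
                continue
            d = min(dist(v, vectors[j]) for j in selected)
            if d > best_d:
                best_i, best_d = i, d
        selected.append(best_i)
    return selected

points = [[0.0], [1.0], [0.5], [10.0]]
print(farthest_first(points, 3))  # [0, 3, 1]
```

Because it maximizes spread, this selection deliberately over-samples outliers, which is why the sampled purity (0.606 below) is far from the cluster's true match proportion.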

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 40 matches and 26 non-matches
    Purity of oracle classification:  0.606
    Entropy of oracle classification: 0.967
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0

Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 148 weight vectors
  Based on 40 matches and 26 non-matches
  Classified 148 matches and 0 non-matches
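When a cluster is still impure or too large after the oracle-labelled sample is removed, the remaining vectors are split by a classifier trained on that sample. A sketch of the split step, assuming a scikit-learn-style SVC (the original program's SVM settings are not shown, and the toy 2-D data below stands in for the 7-D weight vectors):

```python
from sklearn.svm import SVC

# Oracle-labelled sample: similarity weight vectors with
# match (1) / non-match (0) labels.
X_train = [[0.95, 0.90], [1.00, 0.85], [0.90, 1.00],
           [0.20, 0.30], [0.10, 0.15], [0.30, 0.25]]
y_train = [1, 1, 1, 0, 0, 0]

clf = SVC(kernel="rbf")  # kernel choice is an assumption
clf.fit(X_train, y_train)

# Classify the cluster's remaining (unlabelled) weight vectors;
# the predicted classes define the two child clusters.
remaining = [[0.92, 0.88], [0.15, 0.20]]
print([int(p) for p in clf.predict(remaining)])  # [1, 0]
```

A one-sided split like the "148 matches and 0 non-matches" above can occur when all remaining vectors fall on one side of the learned boundary, so no non-match child cluster is produced.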

43.0
Analyzing file: diverg(15)341_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 341), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)341_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 445
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 445 weight vectors
  Containing 196 true matches and 249 true non-matches
    (44.04% true matches)
  Identified 421 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   404  (95.96%)
          2 :    14  (3.33%)
          3 :     2  (0.48%)
          7 :     1  (0.24%)

Identified 0 non-pure unique weight vectors (from 421 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 247

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 445
  Number of unique weight vectors: 421

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (421, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 421 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 421 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 35 matches and 43 non-matches
    Purity of oracle classification:  0.551
    Entropy of oracle classification: 0.992
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 343 weight vectors
  Based on 35 matches and 43 non-matches
  Classified 138 matches and 205 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.5512820512820513, 0.9923985003332222, 0.44871794871794873)
    (205, 0.5512820512820513, 0.9923985003332222, 0.44871794871794873)

Current size of match and non-match training data sets: 35 / 43

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 138 weight vectors
- Estimated match proportion 0.449

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 138 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 49 matches and 7 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analyzing file: diverg(15)182_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 182), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)182_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 515
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 515 weight vectors
  Containing 212 true matches and 303 true non-matches
    (41.17% true matches)
  Identified 479 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   462  (96.45%)
          2 :    14  (2.92%)
          3 :     2  (0.42%)
         19 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 479 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 300

Removed 1 non-pure weight vector

Final number of weight vectors to use: 514
  Number of unique weight vectors: 479

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (479, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 479 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 479 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 36 matches and 44 non-matches
    Purity of oracle classification:  0.550
    Entropy of oracle classification: 0.993
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 399 weight vectors
  Based on 36 matches and 44 non-matches
  Classified 162 matches and 237 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (162, 0.55, 0.9927744539878084, 0.45)
    (237, 0.55, 0.9927744539878084, 0.45)

Current size of match and non-match training data sets: 36 / 44

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 162 weight vectors
- Estimated match proportion 0.450

Sample size for this cluster: 60

Farthest first selection of 60 weight vectors from 162 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 60 weight vectors
  The oracle will correctly classify 60 weight vectors and wrongly classify 0
  Classified 46 matches and 14 non-matches
    Purity of oracle classification:  0.767
    Entropy of oracle classification: 0.784
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  14
    Number of false non-matches: 0
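
The purity and entropy figures reported after each oracle step follow the standard definitions: purity is the fraction belonging to the majority class, and entropy is the binary Shannon entropy of the match/non-match distribution. A minimal sketch (the function name is illustrative, not from the original program):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = fraction of the majority class; entropy = binary
    Shannon entropy of the match / non-match distribution."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The 60-vector oracle step above (46 matches, 14 non-matches)
# yields purity ~0.767 and entropy ~0.784, as reported.
```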

Deleted 60 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(10)808_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (10, 1 - acm diverg, 808), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)808_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 695
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 695 weight vectors
  Containing 212 true matches and 483 true non-matches
    (30.50% true matches)
  Identified 660 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   645  (97.73%)
          2 :    12  (1.82%)
          3 :     2  (0.30%)
         20 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 660 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 177
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 482

Removed 1 non-pure weight vector

Final number of weight vectors to use: 694
  Number of unique weight vectors: 660
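
The pre-processing step above groups duplicate weight vectors, computes each unique vector's pureness (the fraction of its occurrences that are true matches), and drops the minority-class occurrences of any non-pure vector. A minimal sketch, assuming weight vectors arrive as (tuple, is_match) pairs; the function and tie-breaking rule are assumptions, not taken from the original program:

```python
from collections import defaultdict

def remove_non_pure(pairs):
    """pairs: list of (weight_vector_tuple, is_match).
    Drop the minority-class occurrences of every non-pure unique vector."""
    counts = defaultdict(lambda: [0, 0])   # vector -> [matches, non-matches]
    for vec, is_match in pairs:
        counts[vec][0 if is_match else 1] += 1
    kept = []
    for vec, is_match in pairs:
        m, n = counts[vec]
        majority_is_match = m >= n          # assumption: ties keep matches
        if m == 0 or n == 0 or is_match == majority_is_match:
            kept.append((vec, is_match))
    return kept
```

For example, a unique vector occurring 20 times with 19 matches has pureness 0.950; its single non-match occurrence is removed, matching the "Removed 1 non-pure weight vector" step above.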

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (660, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 660 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 660 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
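
The farthest-first traversal used above greedily picks, at each step, the unselected vector whose minimum Euclidean distance to the already-selected set is largest, so the sample spreads across the weight-vector space. A minimal sketch (the seeding rule here, starting from the first vector, is an assumption; the original program may seed differently):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors (lists of floats)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]   # assumption: seed with the first vector
    # min_d[j] = distance from vectors[j] to its nearest selected vector
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        min_d = [min(d, dist(v, vectors[i])) for d, v in zip(min_d, vectors)]
    return selected
```

Keeping the per-vector minimum distances incrementally makes the selection O(k·n) distance computations rather than recomputing all pairs each round.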

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 32 matches and 52 non-matches
    Purity of oracle classification:  0.619
    Entropy of oracle classification: 0.959
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 576 weight vectors
  Based on 32 matches and 52 non-matches
  Classified 310 matches and 266 non-matches
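
The split step trains a classifier on the oracle-labelled sample and uses it to partition the remaining cluster into a predicted-match and a predicted-non-match child. A sketch using scikit-learn's `svm.SVC` (an assumption for illustration; the original program may use a different SVM implementation or kernel):

```python
from sklearn import svm

def svm_split(train_match, train_non_match, remaining):
    """Train an SVM on the labelled sample, then split the remaining
    weight vectors into predicted matches / non-matches."""
    X = train_match + train_non_match
    y = [1] * len(train_match) + [0] * len(train_non_match)
    clf = svm.SVC(kernel="linear")   # assumption: linear kernel
    clf.fit(X, y)
    pred = clf.predict(remaining)
    matches = [v for v, p in zip(remaining, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining, pred) if p == 0]
    return matches, non_matches
```

In the run above this splits the 576 unclassified vectors into child clusters of 310 and 266, both of which inherit the purity, entropy, and estimated match proportion of the oracle sample until they are themselves sampled.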

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (310, 0.6190476190476191, 0.9587118829771318, 0.38095238095238093)
    (266, 0.6190476190476191, 0.9587118829771318, 0.38095238095238093)

Current size of match and non-match training data sets: 32 / 52

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 266 weight vectors
- Estimated match proportion 0.381

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 266 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [1.000, 0.000, 0.786, 0.833, 0.545, 0.478, 0.346] (False)
    [0.533, 0.000, 0.577, 0.783, 0.429, 0.615, 0.478] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.462, 0.667, 0.600, 0.389, 0.615] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.500, 0.739, 0.824, 0.591, 0.550] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.815, 0.643, 0.800, 0.750, 0.429] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.571, 0.867, 0.471, 0.583, 0.643] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [1.000, 0.000, 0.421, 0.625, 0.435, 0.800, 0.643] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 0 matches and 68 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)80_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.982143
recall                 0.183946
f-measure              0.309859
da                           56
dm                            0
ndm                           0
tp                           55
fp                            1
tn                  4.76529e+07
fn                          244
Name: (10, 1 - acm diverg, 80), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)80_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 919
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 919 weight vectors
  Containing 199 true matches and 720 true non-matches
    (21.65% true matches)
  Identified 868 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   834  (96.08%)
          2 :    31  (3.57%)
          3 :     2  (0.23%)
         17 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 868 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 168
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 699

Removed 1 non-pure weight vector

Final number of weight vectors to use: 918
  Number of unique weight vectors: 868

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (868, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 868 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 868 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 31 matches and 55 non-matches
    Purity of oracle classification:  0.640
    Entropy of oracle classification: 0.943
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 782 weight vectors
  Based on 31 matches and 55 non-matches
  Classified 187 matches and 595 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (187, 0.6395348837209303, 0.9430685934712908, 0.36046511627906974)
    (595, 0.6395348837209303, 0.9430685934712908, 0.36046511627906974)

Current size of match and non-match training data sets: 31 / 55

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 595 weight vectors
- Estimated match proportion 0.360

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 595 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.474, 0.692, 0.826, 0.484, 0.545] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.692, 0.583, 0.500, 0.750, 0.731] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

56.0
Analysing the file: diverg(15)374_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 374), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)374_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1061
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1061 weight vectors
  Containing 188 true matches and 873 true non-matches
    (17.72% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   988  (96.96%)
          2 :    28  (2.75%)
          3 :     2  (0.20%)
         11 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 166
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1060
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

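The pureness filtering above groups identical weight vectors, counts how often each occurs with each true match status, and drops the minority-class copies of any non-pure vector. A minimal sketch of that step (the helper name `remove_minority_class` is hypothetical, not from the original program):

```python
from collections import Counter

def remove_minority_class(weight_vectors):
    """Drop minority-class copies of non-pure unique weight vectors.

    weight_vectors: list of (vector_tuple, is_match) pairs. A unique
    vector is non-pure if it occurs with both match statuses; only its
    majority-class copies are kept.
    """
    counts = {}  # vector -> Counter of true match statuses
    for vec, is_match in weight_vectors:
        counts.setdefault(vec, Counter())[is_match] += 1

    kept = []
    for vec, is_match in weight_vectors:
        status_counts = counts[vec]
        if len(status_counts) == 1:
            kept.append((vec, is_match))        # pure: keep all copies
        elif is_match == max(status_counts, key=status_counts.get):
            kept.append((vec, is_match))        # non-pure: keep majority only
    return kept
```

For example, a vector occurring 11 times with 10 matches and 1 non-match has pureness 10/11 = 0.909, and its single non-match copy is removed, matching the "Removed 1" line in the log.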
Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

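The farthest-first selection above repeatedly adds the vector whose distance to its nearest already-selected vector is largest, so the sample spreads out over the weight-vector space. A minimal sketch, assuming Euclidean distance and the first vector as the starting point (both are assumptions; the original program may seed and measure differently):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal.

    Starts from vectors[0], then repeatedly adds the vector whose
    distance to its nearest selected vector is greatest.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected
```

Each round costs O(n) distance updates, so selecting k of n vectors is O(nk), which is why the sample lists above mix extreme True and False vectors rather than near-duplicates.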
Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

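The per-cluster statistics reported after each oracle call follow directly from the match and non-match counts: purity is the majority-class fraction, entropy is the binary entropy of the match proportion, and the match proportion itself seeds the cluster estimates used in the queue. A sketch under those assumptions:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction), binary entropy, and estimated
    match proportion of an oracle-classified sample."""
    total = num_matches + num_non_matches
    p = num_matches / total                  # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                          # 0 * log2(0) treated as 0
            entropy -= q * math.log2(q)
    return purity, entropy, p
```

With the 23 matches and 64 non-matches above, this reproduces the logged values: purity 0.736, entropy 0.833, and match proportion 0.264; a one-sided sample such as 0 matches and 77 non-matches gives purity 1.000 and entropy 0.000.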
Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 75 matches and 857 non-matches

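The SVM step above trains on the oracle-labelled sample and splits the cluster's remaining weight vectors into a predicted-match and a predicted-non-match sub-cluster, which then join the queue. A sketch using scikit-learn (the `svm_split` helper name and the linear kernel are assumptions, not the original program's settings):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on oracle-labelled weight vectors (label 1 = match,
    0 = non-match) and split the remaining vectors by prediction."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(remaining_vecs))
    matches = [v for v, p in zip(remaining_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, pred) if p == 0]
    return matches, non_matches
```

In the run above this is the step that turns the 932 unclassified vectors into sub-clusters of 75 predicted matches and 857 predicted non-matches, based on 23 match and 64 non-match training examples.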
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (75, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (857, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 75 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 38

Farthest first selection of 38 weight vectors from 75 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)

Perform oracle with 100.00% accuracy on 38 weight vectors
  The oracle will correctly classify 38 weight vectors and wrongly classify 0
  Classified 38 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 38 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analyzing file: diverg(20)539_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 539), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)539_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 831
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 831 weight vectors
  Containing 227 true matches and 604 true non-matches
    (27.32% true matches)
  Identified 774 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   737  (95.22%)
          2 :    34  (4.39%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 774 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 583

Removed 1 non-pure weight vector

Final number of weight vectors to use: 830
  Number of unique weight vectors: 774

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (774, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 774 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 774 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 689 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 151 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (538, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 538 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 538 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 9 matches and 64 non-matches
    Purity of oracle classification:  0.877
    Entropy of oracle classification: 0.539
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)582_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 582), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)582_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 854
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 854 weight vectors
  Containing 226 true matches and 628 true non-matches
    (26.46% true matches)
  Identified 797 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   760  (95.36%)
          2 :    34  (4.27%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 797 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 607

Removed 1 non-pure weight vector

Final number of weight vectors to use: 853
  Number of unique weight vectors: 797

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (797, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 797 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 797 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
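The purity and entropy figures reported for each oracle call follow directly from the sampled match proportion p: purity is the majority-class fraction and entropy is the binary Shannon entropy of p. A minimal sketch that reproduces the numbers above (the function name is mine, not from the program):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity, binary entropy, and estimated match proportion of a
    labelled sample: purity = majority-class fraction, and
    entropy = -p*log2(p) - (1-p)*log2(1-p), with 0*log2(0) taken as 0."""
    total = num_matches + num_non_matches
    p = num_matches / total          # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy, p

# 27 matches / 58 non-matches reproduces the log: purity 0.682, entropy 0.902
purity, entropy, p = cluster_stats(27, 58)
```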

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 712 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 148 matches and 564 non-matches
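The split_classifier option selects an SVM in this run; since the program's SVM code is not shown in the log, the sketch below substitutes a nearest-centroid rule (plainly a stand-in) to illustrate the same step: the oracle-labelled sample splits the remaining cluster vectors into predicted matches and non-matches.

```python
def centroid_split(labelled, unlabelled):
    """Split unlabelled weight vectors into predicted matches / non-matches.

    `labelled` is a list of (vector, is_match) pairs from the oracle and
    must contain both classes. The program uses an SVM for this step; a
    nearest-centroid rule is a simple stand-in for illustration.
    """
    def centroid(vecs):
        n = len(vecs)
        return [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    m_cent = centroid([v for v, lab in labelled if lab])
    u_cent = centroid([v for v, lab in labelled if not lab])
    matches, non_matches = [], []
    for v in unlabelled:
        (matches if dist2(v, m_cent) < dist2(v, u_cent)
         else non_matches).append(v)
    return matches, non_matches
```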

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (564, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 564 weight vectors
- Estimated match proportion 0.318
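Each loop iteration pops a cluster from the queue, has the oracle label a sample, adds those labels to the match/non-match training sets, and pushes the split remainder back until the budget runs out. The sketch below compresses that flow; a simple prefix sample stands in for farthest-first selection and a 1-NN rule stands in for the SVM (both plainly substitutions, as is the perfect oracle):

```python
import random

def recursive_selection(vectors, true_labels, budget, sample_size, seed=0):
    """Sketch of the outer loop in this log: pop a random cluster, have
    the oracle label a sample, grow the training sets, split the rest of
    the cluster by a 1-NN rule on the sample, and push both parts back."""
    rng = random.Random(seed)
    queue = [list(range(len(vectors)))]
    train_m, train_u, used = [], [], 0
    while queue and used + sample_size <= budget:
        cluster = queue.pop(rng.randrange(len(queue)))
        sample = cluster[:sample_size]        # stand-in for farthest-first
        rest = cluster[sample_size:]
        labelled = [(i, true_labels[i]) for i in sample]  # perfect oracle
        used += len(sample)
        train_m += [i for i, lab in labelled if lab]
        train_u += [i for i, lab in labelled if not lab]

        def nearest_label(i):                 # 1-NN stand-in for the SVM
            return min(labelled,
                       key=lambda t: sum((a - b) ** 2 for a, b in
                                         zip(vectors[i], vectors[t[0]])))[1]
        part_m = [i for i in rest if nearest_label(i)]
        part_u = [i for i in rest if not nearest_label(i)]
        for part in (part_m, part_u):
            if part:
                queue.append(part)
    return train_m, train_u, used
```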

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 564 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

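Farthest-first selection is the classic greedy k-center traversal: starting from one vector, repeatedly add the vector whose minimum distance to the already-selected set is largest, which yields a diverse sample. A sketch assuming Euclidean distance and a fixed first seed (the program's actual distance and seeding may differ):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: select k mutually distant vectors."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]   # fixed seed; the real seeding is an assumption
    # minimum distance from every candidate to the selected set so far
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            d = dist(v, vectors[i])
            if d < min_d[j]:
                min_d[j] = d
    return selected
```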
Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 10 matches and 62 non-matches
    Purity of oracle classification:  0.861
    Entropy of oracle classification: 0.581
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)600_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 600), dtype: object
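The precision, recall, and f-measure in this evaluation row are consistent with the usual definitions from the confusion counts (tp=39, fp=0, fn=260 gives recall 39/299 = 0.130435 and F1 = 0.230769). A small helper showing the arithmetic (the function name is mine):

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```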

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)600_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1100
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1100 weight vectors
  Containing 227 true matches and 873 true non-matches
    (20.64% true matches)
  Identified 1043 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1006  (96.45%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
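The occurrence histogram above can be produced by counting duplicates twice over, e.g. with collections.Counter (a sketch; the program's own bookkeeping may differ):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of unique weight vectors with it."""
    per_vector = Counter(tuple(v) for v in weight_vectors)  # vector -> count
    return Counter(per_vector.values())                     # count -> #vectors
```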

Identified 1 non-pure unique weight vector (from 1043 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector
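A unique weight vector is non-pure when identical copies carry both labels; the vector with pureness 0.950 here occurs 20 times with 19 matches, and its single minority-class copy is removed. A sketch of that removal (the tie handling is my assumption):

```python
from collections import defaultdict

def remove_minority_copies(pairs):
    """Drop minority-class copies of non-pure duplicate weight vectors.

    pairs: list of (weight_vector_tuple, is_match). A unique vector is
    non-pure when its identical copies carry both labels; the copies in
    the minority class are removed (ties kept as matches, an assumption).
    """
    groups = defaultdict(list)
    for vec, lab in pairs:
        groups[vec].append(lab)
    kept = []
    for vec, labs in groups.items():
        matches = sum(labs)
        if 0 < matches < len(labs):                 # non-pure vector
            majority = matches * 2 >= len(labs)
            kept += [(vec, lab) for lab in labs if lab == majority]
        else:                                       # pure: keep everything
            kept += [(vec, lab) for lab in labs]
    return kept
```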

Final number of weight vectors to use: 1099
  Number of unique weight vectors: 1043

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1043, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1043 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1043 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
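The oracle simulates manual classification at a configurable accuracy (the oracle_acc parameter). A plausible noise model, assumed here rather than taken from the program, flips each true label with probability 1 − accuracy, so at 100.00% accuracy, as in every call in this log, nothing is flipped:

```python
import random

def query_oracle(true_labels, accuracy, rng=None):
    """Return oracle labels, flipping each true label with
    probability 1 - accuracy (a noise model assumed for illustration)."""
    if rng is None:
        rng = random.Random()
    # random() is in [0, 1), so accuracy 1.0 never flips a label
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```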

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 955 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 846 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 846 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 846 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)169_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 169), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)169_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 179 matches and 760 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (760, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 760 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
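The farthest-first traversal used for these selections can be sketched as a greedy loop: start from a seed vector and repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal sketch, not the original script's implementation (the function name `farthest_first`, the Euclidean metric, and seeding with the first vector are assumptions):

```python
import math

def farthest_first(vectors, k):
    """Greedily pick k vectors, each maximising the minimum
    Euclidean distance to the vectors selected so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed choice is an assumption
    while len(selected) < k:
        # Among the unselected vectors, take the one whose nearest
        # already-selected neighbour is farthest away.
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

This is why the selected sample spreads across the weight-vector space rather than concentrating near one corner.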

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)819_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 819), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)819_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1024
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1024 weight vectors
  Containing 198 true matches and 826 true non-matches
    (19.34% true matches)
  Identified 982 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   947  (96.44%)
          2 :    32  (3.26%)
          3 :     2  (0.20%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 982 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 806

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1024
  Number of unique weight vectors: 982

Time to load and analyse the weight vector file: 0.01 sec
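The duplicate analysis above (number of unique vectors plus the occurrence histogram) can be reproduced with `collections.Counter`; a minimal sketch, assuming each weight vector is a list or tuple of floats (the function name is hypothetical):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Return (number of unique weight vectors, histogram mapping an
    occurrence count to how many unique vectors occur that often)."""
    per_vector = Counter(map(tuple, vectors))  # vector -> occurrence count
    histogram = Counter(per_vector.values())   # count -> number of vectors
    return len(per_vector), dict(histogram)
```

For the run above this would yield 982 unique vectors with histogram `{1: 947, 2: 32, 3: 2, 7: 1}`.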

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (982, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 982 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 982 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
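The purity and entropy figures the oracle reports follow directly from the match / non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (function names are assumptions, not from the original script):

```python
import math

def purity(num_match, num_nonmatch):
    """Fraction of the classified vectors in the majority class."""
    return max(num_match, num_nonmatch) / (num_match + num_nonmatch)

def entropy(num_match, num_nonmatch):
    """Binary Shannon entropy (in bits) of the match proportion."""
    p = num_match / (num_match + num_nonmatch)
    if p in (0.0, 1.0):
        return 0.0  # a pure split carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
```

With the 26 matches and 61 non-matches reported just above, this gives purity ≈ 0.701 and entropy ≈ 0.880, matching the log.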

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 895 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 93 matches and 802 non-matches
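The split step above trains an SVM on the 26 + 61 oracle-labelled vectors and uses it to partition the remaining 895 vectors into two candidate clusters. The following dependency-free sketch substitutes a nearest-centroid classifier for the SVM, so it illustrates only the shape of the step, not the actual decision boundary:

```python
def split_cluster(unlabelled, matches, nonmatches):
    """Split the remaining weight vectors into a predicted-match and a
    predicted-non-match cluster. Stand-in for the SVM split: each vector
    is assigned to the nearer of the two labelled-class centroids."""
    def centroid(vectors):
        n = len(vectors)
        return [sum(col) / n for col in zip(*vectors)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    m_centre, n_centre = centroid(matches), centroid(nonmatches)
    match_cluster, nonmatch_cluster = [], []
    for v in unlabelled:
        if sq_dist(v, m_centre) <= sq_dist(v, n_centre):
            match_cluster.append(v)
        else:
            nonmatch_cluster.append(v)
    return match_cluster, nonmatch_cluster
```

Both resulting clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.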

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (93, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (802, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 93 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 93 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(15)551_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 551), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)551_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 732
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 732 weight vectors
  Containing 219 true matches and 513 true non-matches
    (29.92% true matches)
  Identified 677 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   641  (94.68%)
          2 :    33  (4.87%)
          3 :     2  (0.30%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 677 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 492

Removed 1 non-pure weight vector

Final number of weight vectors to use: 731
  Number of unique weight vectors: 677

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (677, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 677 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 677 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 593 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 148 matches and 445 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (445, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 445 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 445 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 8 matches and 62 non-matches
    Purity of oracle classification:  0.886
    Entropy of oracle classification: 0.513
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(15)799_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 799), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)799_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 454
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 454 weight vectors
  Containing 205 true matches and 249 true non-matches
    (45.15% true matches)
  Identified 428 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   414  (96.73%)
          2 :    11  (2.57%)
          3 :     2  (0.47%)
         12 :     1  (0.23%)

Identified 1 non-pure unique weight vector (from 428 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 179
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 248

Removed 1 non-pure weight vector

Final number of weight vectors to use: 453
  Number of unique weight vectors: 428

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (428, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 428 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 428 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
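
The "far" method used above is a farthest-first traversal: starting from one vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the whole cluster. A dependency-free sketch (Euclidean distance and the starting index are assumptions, not taken from the original program):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first selection of k vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        # pick the remaining vector farthest from its nearest selected vector
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```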

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 33 matches and 45 non-matches
    Purity of oracle classification:  0.577
    Entropy of oracle classification: 0.983
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0
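
The oracle step simulates a human reviewer with a given accuracy: each sampled weight vector keeps its true label with probability oracle_acc, otherwise the label is flipped. A minimal sketch (names are illustrative, not the original implementation):

```python
import random

def oracle_classify(true_labels, oracle_acc, seed=None):
    """Return labels as a simulated oracle would classify them."""
    rng = random.Random(seed)
    # Flip each true label with probability (1 - oracle_acc)
    return [lab if rng.random() < oracle_acc else not lab
            for lab in true_labels]
```

With oracle_acc = 1.0 (as in this run) the oracle reproduces the true labels exactly.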

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 350 weight vectors
  Based on 33 matches and 45 non-matches
  Classified 141 matches and 209 non-matches
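
After the oracle step, the remaining cluster is split by a classifier trained on the oracle-labelled samples (an SVM in this run). As a dependency-free stand-in, the split can be sketched with a nearest-centroid rule; note this deliberately replaces the SVM and is not the original classifier:

```python
def split_cluster(cluster, match_train, nonmatch_train):
    """Split unlabelled weight vectors into predicted matches / non-matches
    using the nearer of the two training-set centroids."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_match = centroid(match_train)
    c_nonmatch = centroid(nonmatch_train)
    matches, nonmatches = [], []
    for v in cluster:
        # assign each unlabelled vector to the nearer training centroid
        (matches if sqdist(v, c_match) <= sqdist(v, c_nonmatch)
         else nonmatches).append(v)
    return matches, nonmatches
```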

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.5769230769230769, 0.9828586897127056, 0.4230769230769231)
    (209, 0.5769230769230769, 0.9828586897127056, 0.4230769230769231)

Current size of match and non-match training data sets: 33 / 45

Selected cluster (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 141 weight vectors
- Estimated match proportion 0.423

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.933, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 49 matches and 7 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(20)280_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 280), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)280_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1034
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1034 weight vectors
  Containing 223 true matches and 811 true non-matches
    (21.57% true matches)
  Identified 980 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   943  (96.22%)
          2 :    34  (3.47%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 980 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 790

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1033
  Number of unique weight vectors: 980

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (980, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 980 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 980 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 893 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 160 matches and 733 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (160, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (733, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 733 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 733 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 3 matches and 74 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.238
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(15)807_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (15, 1 - acm diverg, 807), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)807_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 907
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 907 weight vectors
  Containing 157 true matches and 750 true non-matches
    (17.31% true matches)
  Identified 871 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   843  (96.79%)
          2 :    25  (2.87%)
          3 :     2  (0.23%)
          8 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 871 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 141
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 729

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 899
  Number of unique weight vectors: 870

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (870, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 870 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 870 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 23 matches and 63 non-matches
    Purity of oracle classification:  0.733
    Entropy of oracle classification: 0.838
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
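
The purity, entropy, and match-proportion figures the program reports can be reproduced directly from the oracle's match/non-match counts. A minimal sketch (function name is my own, not from the program):

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity is the fraction of the majority class; entropy is the
    base-2 Shannon entropy of the match/non-match proportions."""
    total = num_match + num_non_match
    p = num_match / total          # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy, p

# The 23 matches / 63 non-matches classified by the oracle above:
purity, entropy, prop = cluster_stats(23, 63)
print(round(purity, 3), round(entropy, 3), round(prop, 3))
# → 0.733 0.838 0.267
```

These are exactly the values carried into the Loop 2 queue entries below (0.7325…, 0.8377…, 0.2674…).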

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 784 weight vectors
  Based on 23 matches and 63 non-matches
  Classified 71 matches and 713 non-matches
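
The split step trains a classifier on the oracle-labelled sample and partitions the remaining vectors into two child clusters, which are then pushed onto the queue. A sketch of the partitioning logic, using a nearest-centroid rule as a stand-in for the program's actual SVM (with scikit-learn, `sklearn.svm.SVC` would play this role):

```python
def split_cluster(train_match, train_non_match, unlabelled):
    """Partition `unlabelled` weight vectors into a predicted-match and a
    predicted-non-match child cluster, by distance to the class centroids
    of the oracle-labelled training vectors (stand-in for the SVM)."""
    def centroid(vectors):
        n = len(vectors)
        return [sum(col) / n for col in zip(*vectors)]

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cm, cn = centroid(train_match), centroid(train_non_match)
    match_child, non_match_child = [], []
    for w in unlabelled:
        (match_child if dist2(w, cm) <= dist2(w, cn)
         else non_match_child).append(w)
    return match_child, non_match_child

# Toy example: matches cluster near 1.0, non-matches near 0.0
m, n = split_cluster([[0.9, 1.0]], [[0.1, 0.0]],
                     [[0.8, 0.9], [0.2, 0.1], [0.95, 0.7]])
```

In the run above this step turns the 784 unclassified vectors into child clusters of size 71 and 713.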

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (71, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)
    (713, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)

Current size of match and non-match training data sets: 23 / 63

Selected cluster with (queue ordering: random):
- Purity 0.73 and entropy 0.84
- Size 713 weight vectors
- Estimated match proportion 0.267

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 713 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
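
Farthest-first selection, as used above, greedily picks the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster. A minimal sketch assuming squared Euclidean distance and a fixed starting vector (the program may start from a random or extreme vector):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly add the vector whose nearest already-selected vector is
    farthest away."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < min(k, len(vectors)):
        best = max(remaining,
                   key=lambda w: min(dist2(w, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy 1-D example: the most spread-out points are picked first
picks = farthest_first([[0.0], [0.1], [1.0], [0.5]], 3)
# → [[0.0], [1.0], [0.5]]
```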

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analyzing file: diverg(15)282_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 282), dtype: object
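
The precision, recall, and f-measure values in the summary rows above follow directly from the tp/fp/fn counts; for this row, tp=67, fp=1, fn=232 (function name is my own):

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from raw match-decision counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

p, r, f = prf(tp=67, fp=1, fn=232)
print(round(p, 6), round(r, 6), round(f, 6))
# → 0.985294 0.22408 0.365123
```

The huge tn count (4.77e+07) plays no role in any of the three measures, which is why they are the preferred metrics for heavily imbalanced entity-resolution data.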

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)282_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 441
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 441 weight vectors
  Containing 196 true matches and 245 true non-matches
    (44.44% true matches)
  Identified 417 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   400  (95.92%)
          2 :    14  (3.36%)
          3 :     2  (0.48%)
          7 :     1  (0.24%)
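
The occurrence histogram above can be reproduced by counting duplicate weight vectors; a minimal sketch, assuming the vectors are hashable tuples (function name is my own):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map 'number of occurrences' -> 'number of unique weight vectors
    that occur that many times' (e.g. 2 -> 14 means 14 distinct
    vectors each appeared twice)."""
    per_vector = Counter(map(tuple, weight_vectors))
    return dict(Counter(per_vector.values()))

dist = occurrence_distribution([(0.1, 0.2), (0.1, 0.2), (0.5, 0.9)])
# → {2: 1, 1: 1}
```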

Identified 0 non-pure unique weight vectors (from 417 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 243

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 441
  Number of unique weight vectors: 417

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (417, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 417 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 417 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 38 matches and 40 non-matches
    Purity of oracle classification:  0.513
    Entropy of oracle classification: 1.000
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  40
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 339 weight vectors
  Based on 38 matches and 40 non-matches
  Classified 271 matches and 68 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (271, 0.5128205128205128, 0.9995256892936493, 0.48717948717948717)
    (68, 0.5128205128205128, 0.9995256892936493, 0.48717948717948717)

Current size of match and non-match training data sets: 38 / 40

Selected cluster with (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 271 weight vectors
- Estimated match proportion 0.487

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 271 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [0.817, 1.000, 0.182, 0.115, 0.154, 0.194, 0.111] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.821, 1.000, 0.275, 0.297, 0.227, 0.255, 0.152] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 42 matches and 29 non-matches
    Purity of oracle classification:  0.592
    Entropy of oracle classification: 0.976
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  29
    Number of false non-matches: 0
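
"100.00% accuracy" corresponds to the `oracle_acc` parameter from the usage notes: with a lower accuracy the simulated human oracle flips some true match statuses, producing the false matches / false non-matches counted above. A sketch of how such an oracle can be simulated (function name is my own):

```python
import random

def oracle_classify(true_labels, accuracy, rng=random):
    """Return each weight vector's true match status, but flip it
    independently with probability (1 - accuracy) to simulate an
    imperfect human oracle."""
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]

# With accuracy 1.0, as in all the runs shown here, nothing is flipped:
labels = [True, False, False, True]
assert oracle_classify(labels, 1.0) == labels
```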

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analyzing file: diverg(15)164_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 164), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)164_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1064
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1064 weight vectors
  Containing 219 true matches and 845 true non-matches
    (20.58% true matches)
  Identified 1008 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   972  (96.43%)
          2 :    33  (3.27%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1008 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 824

Removed 1 non-pure weight vectors
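
A unique weight vector is non-pure when the record pairs that generated it carry both match statuses; the program drops the minority-class copies (here, the single copy behind the 0.950-pure vector, shrinking 1064 vectors to 1063). A sketch of that filtering step, assuming (vector, is_match) pairs; the program's exact tie-breaking may differ:

```python
from collections import defaultdict

def remove_non_pure(labelled_vectors):
    """Drop minority-class copies of weight vectors that occur with both
    match statuses; pure vectors (pureness 0.0 or 1.0) pass through."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)   # fraction of matches
        majority = pureness >= 0.5             # tie-break: keep matches
        for is_match in labels:
            if pureness in (0.0, 1.0) or is_match == majority:
                kept.append((vec, is_match))
    return kept

# 2 match copies + 1 non-match copy of one vector: the minority
# non-match copy is removed, the pure vector is untouched.
data = [((0.9, 1.0), True), ((0.9, 1.0), True), ((0.9, 1.0), False),
        ((0.1, 0.0), False)]
kept = remove_non_pure(data)
# → 3 pairs kept
```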

Final number of weight vectors to use: 1063
  Number of unique weight vectors: 1008

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1008, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1008 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1008 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 921 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 325 matches and 596 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (325, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (596, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 596 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 596 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.474, 0.692, 0.826, 0.484, 0.545] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.692, 0.583, 0.500, 0.750, 0.731] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

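The run above follows a fixed pattern: pop a cluster from the queue, oracle-classify a sample of it, and split clusters that are not pure enough (or too large) into sub-clusters, stopping once the manual classification budget is spent. The driver loop can be sketched as below. This is a hypothetical reconstruction: `sample_fn`, `oracle_fn` and `split_fn` stand in for the script's actual sampling, oracle and SVM-split routines, which are not shown in this log, and pure-cluster handling is simplified.

```python
from collections import deque

def recursive_select(cluster, budget, min_purity, max_cluster_size,
                     sample_fn, oracle_fn, split_fn):
    """Hypothetical queue-driven driver loop matching the log: pop a
    cluster, oracle-classify a sample, and push split sub-clusters back
    until the manual classification budget is exhausted."""
    queue = deque([cluster])
    used = 0                      # manual oracle classifications performed
    train_m, train_n = [], []     # match / non-match training data sets
    while queue and used < budget:
        current = queue.popleft()
        sample = sample_fn(current)
        matches, non_matches = oracle_fn(sample)  # manual classification
        used += len(sample)
        train_m += matches
        train_n += non_matches
        # Remove the classified sample from the cluster.
        remaining = [v for v in current if v not in sample]
        purity = max(len(matches), len(non_matches)) / max(len(sample), 1)
        if remaining and (purity < min_purity
                          or len(remaining) > max_cluster_size):
            # Cluster not pure enough or too large: split it further and
            # enqueue both sub-clusters.
            queue.extend(split_fn(remaining, matches, non_matches))
    return train_m, train_n
```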
46.0
Analysing file: diverg(20)112_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 112), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)112_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 961
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 961 weight vectors
  Containing 217 true matches and 744 true non-matches
    (22.58% true matches)
  Identified 906 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   870  (96.03%)
          2 :    33  (3.64%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 906 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 723

Removed 1 non-pure weight vector

Final number of weight vectors to use: 960
  Number of unique weight vectors: 906

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (906, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 906 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 906 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

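The "farthest first" selection logged above is the classic greedy farthest-point traversal: seed the selection with one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the whole cluster. A minimal sketch, assuming Euclidean distance and an arbitrary first seed (the script's actual seeding may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    minimum Euclidean distance to the selected set is largest."""
    selected = [vectors[0]]  # arbitrary seed (assumption: first vector)
    # Each candidate's distance to its nearest selected vector so far.
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        # Newly selected vector may become the nearest one to candidates.
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected
```

This greedy traversal is why the selected sample above mixes extreme match-like and non-match-like weight vectors rather than clustering around one region.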
Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

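The purity and entropy values reported after each oracle step match the standard definitions: purity is the majority-class fraction of the classified sample, and entropy is the base-2 Shannon entropy of the match/non-match split. A small sketch reproducing the logged figures:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = base-2 Shannon
    entropy of the match/non-match distribution."""
    total = num_matches + num_non_matches
    purity = max(num_matches, num_non_matches) / total
    entropy = 0.0
    for count in (num_matches, num_non_matches):
        p = count / total
        if p > 0.0:          # 0 * log2(0) is taken as 0
            entropy -= p * math.log2(p)
    return purity, entropy
```

For the 26 matches and 61 non-matches above this gives purity 0.701 and entropy 0.880, as logged; a single-class sample (0 / 77) gives purity 1.000 and entropy 0.000.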
Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 819 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 135 matches and 684 non-matches

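The SVM split step trains on the oracle-labelled sample and partitions the remaining unlabelled cluster by predicted class, yielding the two sub-clusters pushed onto the queue. A sketch assuming scikit-learn's `SVC` with a linear kernel; the script's actual kernel, features and data are not shown in this log, so small synthetic stand-ins are used:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Hypothetical stand-ins for the 26 + 61 oracle-labelled weight vectors
# (7 similarity weights each, matches scoring higher than non-matches):
match_vecs = rng.uniform(0.7, 1.0, size=(26, 7))
non_match_vecs = rng.uniform(0.0, 0.5, size=(61, 7))

X_train = np.vstack([match_vecs, non_match_vecs])
y_train = np.array([1] * 26 + [0] * 61)

clf = SVC(kernel="linear")   # assumption: linear kernel
clf.fit(X_train, y_train)

# Split the 819 unclassified weight vectors into two sub-clusters:
unlabelled = rng.uniform(0.0, 1.0, size=(819, 7))
pred = clf.predict(unlabelled)
match_cluster = unlabelled[pred == 1]
non_match_cluster = unlabelled[pred == 0]
```

Both sub-clusters inherit the parent's purity/entropy estimate in the log above, which is why loop 2 shows identical statistics for the 135- and 684-vector clusters.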
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (684, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 135 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 135 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 50 matches and 1 non-matches
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.139
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(15)230_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 230), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)230_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 649
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 649 weight vectors
  Containing 199 true matches and 450 true non-matches
    (30.66% true matches)
  Identified 622 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   606  (97.43%)
          2 :    13  (2.09%)
          3 :     2  (0.32%)
         11 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 622 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 447

Removed 1 non-pure weight vector

Final number of weight vectors to use: 648
  Number of unique weight vectors: 622

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (622, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 622 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 622 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 539 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 127 matches and 412 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (127, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (412, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 127 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 127 vectors
  The selected farthest weight vectors are:
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 48 matches and 3 non-matches
    Purity of oracle classification:  0.941
    Entropy of oracle classification: 0.323
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(20)406_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (20, 1 - acm diverg, 406), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)406_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 201 true matches and 752 true non-matches
    (21.09% true matches)
  Identified 908 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   874  (96.26%)
          2 :    31  (3.41%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 908 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 952
  Number of unique weight vectors: 908

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (908, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 908 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 908 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
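
The "far" initial selection above is a farthest-first traversal over the cluster's weight vectors. A minimal sketch of the idea (the arbitrary seed and the Euclidean metric are assumptions; the program's own implementation may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedily pick k vectors: each new pick is the vector whose
    minimum distance to the already selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed with an arbitrary vector (an assumption)
    while len(selected) < k:
        candidates = [v for v in vectors if v not in selected]
        best = max(candidates, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

sample = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.5, 0.4], [0.9, 0.1]], 3)
```

This greedy strategy favours the spread-out corners of the weight-vector space, which is why the selected vectors above mix extreme similarity values rather than clustering around one region.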

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 25 matches and 62 non-matches
    Purity of oracle classification:  0.713
    Entropy of oracle classification: 0.865
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0
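
The purity and entropy figures printed above follow the usual cluster-quality definitions: purity is the majority-class proportion and entropy is the binary entropy of the match proportion. A small sketch reproducing the numbers for the 25 / 62 split (the function name is illustrative):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Majority-class purity and binary entropy of a match/non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total               # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p

purity, entropy, prop = cluster_stats(25, 62)
# purity ≈ 0.713, entropy ≈ 0.865, match proportion ≈ 0.287, as in the log
```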

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 821 weight vectors
  Based on 25 matches and 62 non-matches
  Classified 110 matches and 711 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (110, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)
    (711, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 25 / 62

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 110 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 46

Farthest first selection of 46 weight vectors from 110 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 46 weight vectors
  The oracle will correctly classify 46 weight vectors and wrongly classify 0
  Classified 45 matches and 1 non-matches
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.151
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 46 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(15)183_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 183), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)183_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 936
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 936 weight vectors
  Containing 217 true matches and 719 true non-matches
    (23.18% true matches)
  Identified 881 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   845  (95.91%)
          2 :    33  (3.75%)
          3 :     2  (0.23%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 881 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 698

Removed 1 non-pure weight vector

Final number of weight vectors to use: 935
  Number of unique weight vectors: 881
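
The occurrence distribution and pureness figures above can be reproduced with a few lines of standard Python (a sketch; `analyse_weight_vectors` is an illustrative name, not the program's function):

```python
from collections import Counter

def analyse_weight_vectors(vectors, labels):
    """Occurrence distribution of duplicate weight vectors, plus the
    pureness (fraction of true matches) of each unique vector."""
    occ = Counter(tuple(v) for v in vectors)
    freq_dist = Counter(occ.values())    # occurrence -> number of unique vectors
    counts = {}                          # unique vector -> (total, matches)
    for vec, lab in zip(vectors, labels):
        t, m = counts.get(tuple(vec), (0, 0))
        counts[tuple(vec)] = (t + 1, m + int(lab))
    pureness = {k: m / t for k, (t, m) in counts.items()}
    return freq_dist, pureness

freq, pure = analyse_weight_vectors([[0.1], [0.1], [0.2]], [True, False, True])
```

Unique vectors whose pureness is strictly between 0 and 1 are the non-pure ones reported above, and their minority-class copies are removed before training selection starts.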

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (881, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 881 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 881 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 25 matches and 61 non-matches
    Purity of oracle classification:  0.709
    Entropy of oracle classification: 0.870
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 795 weight vectors
  Based on 25 matches and 61 non-matches
  Classified 133 matches and 662 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)
    (662, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)

Current size of match and non-match training data sets: 25 / 61

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 662 weight vectors
- Estimated match proportion 0.291

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 662 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 13 matches and 58 non-matches
    Purity of oracle classification:  0.817
    Entropy of oracle classification: 0.687
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(10)698_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 698), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)698_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 600
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 600 weight vectors
  Containing 172 true matches and 428 true non-matches
    (28.67% true matches)
  Identified 580 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   569  (98.10%)
          2 :     8  (1.38%)
          3 :     2  (0.34%)
          9 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 580 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 154
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 425

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 591
  Number of unique weight vectors: 579

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (579, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 579 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 579 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.538, 0.789, 0.353, 0.545, 0.550] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 31 matches and 51 non-matches
    Purity of oracle classification:  0.622
    Entropy of oracle classification: 0.957
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 497 weight vectors
  Based on 31 matches and 51 non-matches
  Classified 127 matches and 370 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (127, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)
    (370, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)

Current size of match and non-match training data sets: 31 / 51

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 127 weight vectors
- Estimated match proportion 0.378

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 127 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)

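The farthest-first selections above greedily pick, at each step, the vector whose distance to its nearest already-selected vector is largest, yielding a diverse sample of the cluster. A sketch under the assumption of Euclidean distance and first-vector seeding (neither is confirmed by the log):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors.
    Seeds with the first vector; the original program's seeding
    and distance metric may differ."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # pick the remaining vector whose nearest selected vector is farthest
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```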
Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 45 matches and 8 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analyzing file: diverg(15)65_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 65), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)65_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 732
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 732 weight vectors
  Containing 184 true matches and 548 true non-matches
    (25.14% true matches)
  Identified 708 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   695  (98.16%)
          2 :    10  (1.41%)
          3 :     2  (0.28%)
         11 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 708 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 162
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 545

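The pureness table above tallies, for each unique weight vector, the fraction of its duplicate occurrences that are true matches; vectors with pureness strictly between 0 and 1 are the non-pure ones that get trimmed. A minimal sketch of the tally (the function name is illustrative):

```python
from collections import Counter, defaultdict

def pureness_distribution(weight_vectors, true_labels):
    """Pureness (fraction of true-match occurrences) of each unique
    weight vector, tallied as in the 'Pureness : Count' table."""
    groups = defaultdict(list)
    for vec, lbl in zip(weight_vectors, true_labels):
        groups[tuple(vec)].append(lbl)
    return Counter(round(sum(lbls) / len(lbls), 3)
                   for lbls in groups.values())
```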
Removed 1 non-pure weight vector

Final number of weight vectors to use: 731
  Number of unique weight vectors: 708

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (708, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 708 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 708 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 34 matches and 50 non-matches
    Purity of oracle classification:  0.595
    Entropy of oracle classification: 0.974
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

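The oracle simulates manual classification at a configurable accuracy: it returns each true match status, flipping it with probability 1 − accuracy (at 100% accuracy, as in all passes here, no labels are flipped). A hedged sketch of that simulation:

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Simulate a manual oracle: return each true label, flipped with
    probability (1 - accuracy); with accuracy 1.0 every label is correct."""
    rng = rng or random.Random(42)  # fixed seed is an assumption, for reproducibility
    return [lbl if rng.random() < accuracy else (not lbl)
            for lbl in true_labels]
```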
Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 624 weight vectors
  Based on 34 matches and 50 non-matches
  Classified 293 matches and 331 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (293, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)
    (331, 0.5952380952380952, 0.9736680645496201, 0.40476190476190477)

Current size of match and non-match training data sets: 34 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.60 and entropy 0.97
- Size 293 weight vectors
- Estimated match proportion 0.405

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 293 vectors
  The selected farthest weight vectors are:
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.890, 1.000, 0.281, 0.136, 0.183, 0.250, 0.163] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 42 matches and 28 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analyzing file: diverg(10)513_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990826
recall                 0.361204
f-measure              0.529412
da                          109
dm                            0
ndm                           0
tp                          108
fp                            1
tn                  4.76529e+07
fn                          191
Name: (10, 1 - acm diverg, 513), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)513_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 886
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 886 weight vectors
  Containing 148 true matches and 738 true non-matches
    (16.70% true matches)
  Identified 850 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   822  (96.71%)
          2 :    25  (2.94%)
          3 :     2  (0.24%)
          8 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 850 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 132
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 717

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 878
  Number of unique weight vectors: 849

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (849, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 849 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 849 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 763 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 43 matches and 720 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (43, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (720, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 43 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 28

Farthest first selection of 28 weight vectors from 43 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 1.000, 1.000, 0.952, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)

Perform oracle with 100.00% accuracy on 28 weight vectors
  The oracle will correctly classify 28 weight vectors and wrongly classify 0
  Classified 28 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 28 weight vectors (classified by oracle) from cluster

Cluster is pure enough and not too large, add its 43 weight vectors to:
  Match training set

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 3: Queue length: 1
  Number of manual oracle classifications performed: 114
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (720, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 67 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 720 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 720 vectors
  The selected farthest weight vectors are:
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 15 matches and 55 non-matches
    Purity of oracle classification:  0.786
    Entropy of oracle classification: 0.750
    Number of true matches:      15
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
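
The purity and entropy figures the oracle step reports can be reproduced from the match/non-match counts alone: purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the class distribution. A minimal sketch (the function names are mine, not the script's):

```python
import math

def purity(num_matches, num_non_matches):
    # Fraction of weight vectors belonging to the majority class.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Shannon entropy (base 2) of the match/non-match distribution:
    # 0.0 for a pure cluster, 1.0 for a 50/50 split.
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        p = count / total
        if p > 0.0:
            h -= p * math.log2(p)
    return h

# The oracle classification above: 15 matches, 55 non-matches.
print(round(purity(15, 55), 3))   # 0.786
print(round(entropy(15, 55), 3))  # 0.75
```

These match the 0.786 purity and 0.750 entropy printed in the log above.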

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

109.0
Analysing file: diverg(15)571_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 571), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)571_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 793
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 793 weight vectors
  Containing 223 true matches and 570 true non-matches
    (28.12% true matches)
  Identified 739 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   702  (94.99%)
          2 :    34  (4.60%)
          3 :     2  (0.27%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 739 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 549

Removed 1 non-pure weight vector
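
The minority-class removal above can be sketched as: group identical weight vectors, compute each group's pureness (fraction of true matches), and drop the minority-class copies of any non-pure group. All names here are illustrative, not the script's own:

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors):
    """weight_vectors: list of (vector, is_match) pairs.
    Keep only the majority-class copies of each unique vector.
    Ties (pureness exactly 0.5) keep the match copies -- an assumption."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)  # fraction of true matches
        majority_is_match = pureness >= 0.5
        kept += [(vec, m) for m in labels if m == majority_is_match]
    return kept

# A vector seen 16 times as a match and once as a non-match has
# pureness 16/17 = 0.941 (the value reported above); its single
# non-match copy is removed.
data = [((0.9, 0.8), True)] * 16 + [((0.9, 0.8), False)]
print(len(remove_minority_copies(data)))  # 16
```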

Final number of weight vectors to use: 792
  Number of unique weight vectors: 739

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (739, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 739 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 739 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
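
The "far" initial selection is a farthest-first traversal: starting from a seed vector, repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest, so the sample spreads across the cluster. A minimal sketch (seeding with the first vector is an assumption; the script may seed differently):

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over a list of weight vectors.
    selected = [vectors[0]]          # seed with the first vector (assumption)
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the vector farthest from its nearest selected vector.
        best = max(remaining,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

vecs = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(vecs, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```

Note how the near-duplicates (0.1, 0.0) and (0.9, 1.0) are skipped: the traversal favours vectors far from everything already chosen, which is why the selections above mix clear matches, clear non-matches, and borderline cases.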

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 654 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 157 matches and 497 non-matches
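
The split step trains a classifier (here an SVM, per the split_classifier option) on the oracle-labelled vectors and uses it to partition the remaining cluster into a predicted-match and a predicted-non-match sub-cluster, both pushed back onto the queue. A sketch of the mechanics, using a nearest-centroid rule as a dependency-free stand-in for the SVM decision:

```python
def centroid(vectors):
    # Component-wise mean of a list of equal-length vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def split_cluster(labelled, unlabelled):
    """labelled: (vector, is_match) pairs from the oracle.
    Assign each unlabelled vector to the closer class centroid
    (a stand-in for the SVM decision) and return both sub-clusters."""
    match_c = centroid([v for v, is_m in labelled if is_m])
    non_match_c = centroid([v for v, is_m in labelled if not is_m])
    pred_matches, pred_non_matches = [], []
    for v in unlabelled:
        if sq_dist(v, match_c) < sq_dist(v, non_match_c):
            pred_matches.append(v)
        else:
            pred_non_matches.append(v)
    return pred_matches, pred_non_matches

labelled = [((0.9, 0.8), True), ((1.0, 0.9), True),
            ((0.2, 0.1), False), ((0.3, 0.2), False)]
m, n = split_cluster(labelled, [(0.85, 0.9), (0.1, 0.3), (0.25, 0.15)])
print(len(m), len(n))  # 1 2
```

In the run above, the same mechanics with an SVM split the 654 leftover vectors into sub-clusters of 157 and 497, which then appear as the two queue entries in Loop 2.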

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (157, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (497, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 497 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 497 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.767, 0.667, 0.545, 0.786, 0.773] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.857, 0.444, 0.556, 0.235, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 4 matches and 70 non-matches
    Purity of oracle classification:  0.946
    Entropy of oracle classification: 0.303
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)796_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.197324
f-measure              0.329609
da                           59
dm                            0
ndm                           0
tp                           59
fp                            0
tn                  4.76529e+07
fn                          240
Name: (10, 1 - acm diverg, 796), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)796_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 693
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 693 weight vectors
  Containing 198 true matches and 495 true non-matches
    (28.57% true matches)
  Identified 648 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   614  (94.75%)
          2 :    31  (4.78%)
          3 :     2  (0.31%)
         11 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 648 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 474

Removed 1 non-pure weight vector

Final number of weight vectors to use: 692
  Number of unique weight vectors: 648

Time to load and analyse the weight vector file: 0.04 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (648, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 648 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 648 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 565 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 155 matches and 410 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (410, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 155 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 155 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 44 matches and 11 non-matches
    Purity of oracle classification:  0.800
    Entropy of oracle classification: 0.722
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

59.0
Analysing file: diverg(20)397_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 397), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)397_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 810
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 810 weight vectors
  Containing 223 true matches and 587 true non-matches
    (27.53% true matches)
  Identified 756 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   719  (95.11%)
          2 :    34  (4.50%)
          3 :     2  (0.26%)
         17 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 756 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 566

Removed 1 non-pure weight vector

Final number of weight vectors to use: 809
  Number of unique weight vectors: 756

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (756, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 756 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 756 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
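
The farthest-first traversal behind this selection picks, at each step, the vector whose distance to its nearest already-selected vector is largest. A minimal sketch (the seeding strategy is a simplifying assumption; `farthest_first` is not the script's actual function):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors, each as far as possible from those already chosen."""
    if not vectors or k <= 0:
        return []
    selected = [vectors[0]]                 # seed with an arbitrary vector
    while len(selected) < min(k, len(vectors)):
        best, best_dist = None, -1.0
        for v in vectors:
            # distance from v to its closest already-selected vector
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected
```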

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
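
The purity and entropy figures reported here follow the usual definitions for a two-class sample: purity is the majority-class fraction, entropy the binary Shannon entropy of the match proportion. A minimal sketch (hypothetical helper, assuming only the two counts):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary Shannon entropy."""
    total = num_matches + num_non_matches
    p = num_matches / total                 # match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 27 matches and 58 non-matches above this gives purity 0.682 and entropy 0.902, matching the values the script reports.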

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 671 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 94 matches and 577 non-matches
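
Here the script trains an SVM on the 85 oracle-labelled vectors and uses it to split the remaining cluster into predicted matches and non-matches. As a dependency-free illustration of that split step, the sketch below substitutes a nearest-centroid classifier for the SVM (a deliberate simplification; `classifier_split` is a hypothetical helper, not the script's code):

```python
def classifier_split(labelled, unlabelled):
    """Split `unlabelled` vectors into predicted matches / non-matches by
    distance to the centroids of the oracle-labelled training vectors.
    labelled: list of (vector, is_match) pairs."""
    def centroid(vs):
        return [sum(col) / len(vs) for col in zip(*vs)]
    cm = centroid([v for v, m in labelled if m])       # match centroid
    cn = centroid([v for v, m in labelled if not m])   # non-match centroid
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    pred_m, pred_n = [], []
    for v in unlabelled:
        (pred_m if dist2(v, cm) <= dist2(v, cn) else pred_n).append(v)
    return pred_m, pred_n
```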

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (577, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 577 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 577 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 20 matches and 53 non-matches
    Purity of oracle classification:  0.726
    Entropy of oracle classification: 0.847
    Number of true matches:      20
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
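
Putting the pieces together, the loop traced in this log (pop a cluster from the queue, oracle-label a sample of it, split the remainder with a classifier, and stop once the manual classification budget is spent) can be outlined roughly as follows; all names are hypothetical and the purity/splitting criteria are simplified:

```python
from collections import deque

def recursive_selection(vectors, oracle, sample, split, budget,
                        min_purity, max_cluster_size):
    """Hypothetical outline of the recursive selection loop, not the script
    itself. `sample`, `oracle` and `split` are caller-supplied callables."""
    queue = deque([list(vectors)])
    train_m, train_n = [], []   # match / non-match training data sets
    used = 0                    # manual oracle classifications performed
    while queue:
        cluster = queue.popleft()
        chosen = sample(cluster)            # e.g. farthest-first selection
        if used + len(chosen) > budget:
            break                           # end of manual classification budget
        used += len(chosen)
        for v in chosen:
            (train_m if oracle(v) else train_n).append(v)
        rest = [v for v in cluster if v not in chosen]
        purity = max(len(train_m), len(train_n)) / max(used, 1)
        # split further if the cluster is not pure enough or too large
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            part_a, part_b = split(train_m, train_n, rest)
            queue.extend(p for p in (part_a, part_b) if p)
    return train_m, train_n
```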

42.0
Analysing the file: diverg(20)880_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 880), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)880_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1100
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1100 weight vectors
  Containing 227 true matches and 873 true non-matches
    (20.64% true matches)
  Identified 1043 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1006  (96.45%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1043 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1099
  Number of unique weight vectors: 1043

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1043, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1043 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1043 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 955 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 846 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)237_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 237), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)237_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 644
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 644 weight vectors
  Containing 214 true matches and 430 true non-matches
    (33.23% true matches)
  Identified 611 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   595  (97.38%)
          2 :    13  (2.13%)
          3 :     2  (0.33%)
         17 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 611 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 181
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 429

Removed 1 non-pure weight vector

Final number of weight vectors to use: 643
  Number of unique weight vectors: 611

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (611, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 611 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 611 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.364, 0.619, 0.471, 0.600, 0.533] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 34 matches and 49 non-matches
    Purity of oracle classification:  0.590
    Entropy of oracle classification: 0.976
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 528 weight vectors
  Based on 34 matches and 49 non-matches
  Classified 284 matches and 244 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (284, 0.5903614457831325, 0.9763102872004581, 0.40963855421686746)
    (244, 0.5903614457831325, 0.9763102872004581, 0.40963855421686746)

Current size of match and non-match training data sets: 34 / 49

Selected cluster with (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 244 weight vectors
- Estimated match proportion 0.410

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 244 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 0.667, 0.412, 0.857] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.667, 0.722, 0.353, 0.545, 0.800] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.813, 0.619, 0.333, 0.500, 0.571] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.615, 0.826, 0.286, 0.857, 0.643] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.500, 0.452, 0.632, 0.714, 0.667] (False)
    [1.000, 0.000, 0.591, 0.762, 0.647, 0.636, 0.550] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.818, 0.762, 0.714, 0.500, 0.400] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.790, 0.000, 0.636, 0.619, 0.429, 0.450, 0.609] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.950, 0.000, 0.619, 0.800, 0.478, 0.280, 0.625] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [0.611, 0.000, 0.800, 0.684, 0.500, 0.778, 0.609] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
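
Farthest first selection, as used above, greedily picks weight vectors that are maximally spread out in the similarity space; a minimal sketch, assuming Euclidean distance and seeding from the first vector (both assumptions — the program's exact metric and seeding are not shown in this log):

```python
def farthest_first(vectors, k):
    # Greedy farthest-first traversal: repeatedly add the vector whose
    # distance to the closest already-selected vector is largest.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    remaining = list(vectors)
    selected = [remaining.pop(0)]  # seed (assumed: first vector)
    while remaining and len(selected) < k:
        i = max(range(len(remaining)),
                key=lambda j: min(dist(remaining[j], s) for s in selected))
        selected.append(remaining.pop(i))
    return selected
```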

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 0 matches and 67 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
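
The overall procedure traced in this log — pop a cluster from the queue in random order, sample from it, query the oracle, accumulate training data, and split the remainder, until the manual classification budget is reached — can be sketched as follows (stopping rules and sample sizing are simplified assumptions):

```python
import random

def recursive_selection(start_cluster, budget, sample_fn, oracle_fn, split_fn):
    # sample_fn: pick a sample from a cluster (e.g. farthest-first)
    # oracle_fn: label a sample, returning (matches, non_matches)
    # split_fn:  split the unlabelled rest into sub-clusters (e.g. SVM)
    queue = [start_cluster]
    train_matches, train_non_matches = [], []
    num_oracle = 0
    while queue and num_oracle < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random ordering
        sample = sample_fn(cluster)
        matches, non_matches = oracle_fn(sample)
        num_oracle += len(sample)
        train_matches += matches
        train_non_matches += non_matches
        rest = [v for v in cluster if v not in sample]
        if rest:
            queue.extend(split_fn(rest, train_matches, train_non_matches))
    return train_matches, train_non_matches, num_oracle
```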

42.0
Analysing file: diverg(20)817_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 817), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)817_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
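
The occurrence distribution above counts how often each distinct weight vector appears in the file; a sketch with `collections.Counter`:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First count how often each unique vector occurs, then count how
    # many vectors share each occurrence count (the printed table).
    counts = Counter(tuple(v) for v in weight_vectors)
    dist = Counter(counts.values())
    return dict(sorted(dist.items()))

print(occurrence_distribution([[1], [1], [2]]))  # {1: 1, 2: 1}
```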

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044
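
The pureness clean-up drops the minority-class copies of any unique weight vector that was generated by both true matches and true non-matches; a sketch of that behaviour (the majority-tie handling is an assumption):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, true_match_labels):
    # Pureness of a unique vector = fraction of its occurrences that
    # were generated by true matches. Non-pure vectors (0 < pureness < 1)
    # keep only their majority-class copies.
    groups = defaultdict(list)
    for vec, label in zip(weight_vectors, true_match_labels):
        groups[tuple(vec)].append(label)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        majority = 1 if pureness >= 0.5 else 0
        for label in labels:
            if pureness in (0.0, 1.0) or label == majority:
                kept.append((list(vec), label))
    return kept
```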

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analysing file: diverg(15)886_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 886), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)886_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 695
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 695 weight vectors
  Containing 200 true matches and 495 true non-matches
    (28.78% true matches)
  Identified 650 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   616  (94.77%)
          2 :    31  (4.77%)
          3 :     2  (0.31%)
         11 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 650 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 474

Removed 1 non-pure weight vector

Final number of weight vectors to use: 694
  Number of unique weight vectors: 650

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (650, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 650 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 650 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 567 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 156 matches and 411 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (411, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 411 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 411 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [1.000, 0.000, 0.700, 0.429, 0.476, 0.647, 0.810] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 1 match and 70 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.107
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(10)907_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                0.9875
recall                 0.264214
f-measure              0.416887
da                           80
dm                            0
ndm                           0
tp                           79
fp                            1
tn                  4.76529e+07
fn                          220
Name: (10, 1 - acm diverg, 907), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)907_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 410
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 410 weight vectors
  Containing 181 true matches and 229 true non-matches
    (44.15% true matches)
  Identified 389 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   379  (97.43%)
          2 :     7  (1.80%)
          3 :     2  (0.51%)
         11 :     1  (0.26%)

Identified 1 non-pure unique weight vector (from 389 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 160
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 228

Removed 1 non-pure weight vector

Final number of weight vectors to use: 409
  Number of unique weight vectors: 389

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (389, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 389 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 389 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
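The farthest-first selections logged here repeatedly pick the weight vector that is farthest from everything chosen so far. A minimal sketch of that greedy traversal (the function name `farthest_first`, the first-vector seeding, and the Euclidean metric are assumptions, not taken from recursive-train-selection.py):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with an arbitrary vector,
    then repeatedly add the vector whose minimum distance to the
    already-selected set is largest."""
    selected = [vectors[0]]
    # Minimum distance from each vector to the selected set so far
    dists = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: dists[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            dists[i] = min(dists[i], math.dist(v, vectors[idx]))
    return selected
```

Each round costs one pass over all vectors, so selecting k of n vectors is O(k n) distance computations, which matches the modest sample sizes (77 of 389, etc.) seen in this run.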

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 37 matches and 40 non-matches
    Purity of oracle classification:  0.519
    Entropy of oracle classification: 0.999
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  40
    Number of false non-matches: 0
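The purity and entropy figures reported after each oracle step are consistent with the usual two-class definitions: purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the match/non-match proportions. A sketch (the function name `purity_and_entropy` is illustrative, not from the program):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class.
    Entropy: Shannon entropy (base 2) of the two class proportions."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # define 0 * log2(0) = 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

With the 37 matches and 40 non-matches above this gives purity 0.519 and entropy 0.999, matching the logged values.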

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 312 weight vectors
  Based on 37 matches and 40 non-matches
  Classified 123 matches and 189 non-matches
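The SVM step above trains a classifier on the oracle-labelled sample and uses its predictions to split the remaining cluster into two child clusters (here 123 predicted matches and 189 predicted non-matches). Scikit-learn's SVC could fill that role; the dependency-free sketch below substitutes a nearest-centroid rule (plainly not an SVM) just to illustrate the split mechanics:

```python
def split_cluster(labelled, cluster):
    """Split `cluster` into predicted matches / non-matches.
    `labelled` is a list of (vector, is_match) pairs from the oracle.
    Nearest-centroid rule used as a stand-in for the program's SVM."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]
    match_c = centroid([v for v, m in labelled if m])
    non_c = centroid([v for v, m in labelled if not m])
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    matches = [v for v in cluster if sq_dist(v, match_c) <= sq_dist(v, non_c)]
    non_matches = [v for v in cluster if sq_dist(v, match_c) > sq_dist(v, non_c)]
    return matches, non_matches
```

Both child clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.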

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.5194805194805194, 0.9989047442823606, 0.4805194805194805)
    (189, 0.5194805194805194, 0.9989047442823606, 0.4805194805194805)

Current size of match and non-match training data sets: 37 / 40

Selected cluster (queue ordering: random) with:
- Purity 0.52 and entropy 1.00
- Size 189 weight vectors
- Estimated match proportion 0.481

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 189 vectors
  The selected farthest weight vectors are:
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.717, 1.000, 0.240, 0.231, 0.065, 0.192, 0.184] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.224, 0.219, 0.140, 0.209, 0.161] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.850, 1.000, 0.179, 0.205, 0.188, 0.061, 0.180] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 6 matches and 58 non-matches
    Purity of oracle classification:  0.906
    Entropy of oracle classification: 0.449
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

80.0
Analysing the file: diverg(15)930_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 930), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)930_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 630
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 630 weight vectors
  Containing 199 true matches and 431 true non-matches
    (31.59% true matches)
  Identified 598 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   583  (97.49%)
          2 :    12  (2.01%)
          3 :     2  (0.33%)
         17 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 598 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 169
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 428

Removed 1 non-pure weight vector

Final number of weight vectors to use: 629
  Number of unique weight vectors: 598
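The pureness analysis above groups duplicate weight vectors and checks how consistently each one is labelled: a unique vector whose occurrences carry mixed true-match labels is non-pure, and its minority-class copies are removed. A minimal sketch of that computation (the function name and (vector, label) data layout are assumptions):

```python
from collections import Counter

def pureness_of_unique_vectors(weight_vectors):
    """For each unique weight vector, return the fraction of its
    occurrences that are true matches. Values strictly between
    0 and 1 indicate a non-pure vector (same vector, mixed labels)."""
    totals = Counter()
    matches = Counter()
    for vec, is_match in weight_vectors:
        key = tuple(vec)  # make the vector hashable
        totals[key] += 1
        if is_match:
            matches[key] += 1
    return {key: matches[key] / totals[key] for key in totals}
```

In the run above only one of the 598 unique vectors had a pureness of 0.941 rather than exactly 0.0 or 1.0, so a single minority-class occurrence was dropped.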

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (598, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 598 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 598 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 29 matches and 54 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.934
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 515 weight vectors
  Based on 29 matches and 54 non-matches
  Classified 142 matches and 373 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)
    (373, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)

Current size of match and non-match training data sets: 29 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 142 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 47 matches and 7 non-matches
    Purity of oracle classification:  0.870
    Entropy of oracle classification: 0.556
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing the file: diverg(20)946_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 946), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)946_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 179 matches and 760 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (760, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 760 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
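The farthest-first traversal used for the selection above can be sketched like this: a simplified greedy version assuming Euclidean distance, seeded from the first vector. The real program may seed, measure distance, and break ties differently.

```python
import math

def farthest_first(vectors, k):
    """Greedily pick k vectors: start from the first vector, then repeatedly
    add the vector whose distance to its nearest selected vector is largest."""
    selected = [vectors[0]]
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected
```

This spreads the sample across the cluster, which is why the selected vectors above look so heterogeneous rather than tightly grouped.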

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0
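An oracle with a configurable accuracy, as invoked above, can be simulated by flipping each true label with probability (1 - accuracy). A small sketch; the function name `noisy_oracle` and the seeding are assumptions.

```python
import random

def noisy_oracle(true_labels, accuracy, seed=42):
    """Simulate a human oracle: each true match label is kept with
    probability `accuracy` and flipped otherwise (sketch, not the
    program's actual oracle)."""
    rng = random.Random(seed)
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]
```

At 100% accuracy no labels are flipped, which matches the "wrongly classify 0" lines in the output above.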

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)934_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                0.9875
recall                 0.264214
f-measure              0.416887
da                           80
dm                            0
ndm                           0
tp                           79
fp                            1
tn                  4.76529e+07
fn                          220
Name: (10, 1 - acm diverg, 934), dtype: object
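The precision, recall, and f-measure values in the summary row above follow the standard definitions from the tp/fp/fn counts. A quick check (the helper name `prf` is an assumption):

```python
def prf(tp, fp, fn):
    """Standard precision, recall and F-measure from true positive,
    false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```

With tp=79, fp=1, fn=220 this reproduces the precision 0.9875, recall 0.264214, and f-measure 0.416887 shown above.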

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)934_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 800
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 800 weight vectors
  Containing 186 true matches and 614 true non-matches
    (23.25% true matches)
  Identified 758 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   727  (95.91%)
          2 :    28  (3.69%)
          3 :     2  (0.26%)
         11 :     1  (0.13%)
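The frequency distribution above (how many unique weight vectors occur once, twice, etc.) can be computed with two nested counts; a minimal sketch, assuming weight vectors arrive as sequences of floats:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each unique weight vector occurs, then tally how
    many unique vectors share each occurrence count."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```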

Identified 1 non-pure unique weight vector (from 758 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 164
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 593

Removed 1 non-pure weight vector

Final number of weight vectors to use: 799
  Number of unique weight vectors: 758

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (758, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 758 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 758 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
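The purity and entropy figures reported for each cluster (here 0.659 and 0.926) are the majority-class fraction and the base-2 Shannon entropy of the match/non-match split. A small sketch, assuming base-2 entropy, which is consistent with the values in this log:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and base-2 Shannon entropy of a
    two-class (match / non-match) cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For 29 matches and 56 non-matches this gives purity 0.659 and entropy 0.926, matching the output above; a pure cluster (e.g. 0 matches, 77 non-matches) gives purity 1.0 and entropy 0.0.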

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 673 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 134 matches and 539 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (134, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (539, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 134 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 134 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 45 matches and 8 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

80.0
Analysing the file: diverg(15)86_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 86), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)86_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 682
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 682 weight vectors
  Containing 219 true matches and 463 true non-matches
    (32.11% true matches)
  Identified 649 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   633  (97.53%)
          2 :    13  (2.00%)
          3 :     2  (0.31%)
         17 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 649 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 462

Removed 1 non-pure weight vector

Final number of weight vectors to use: 681
  Number of unique weight vectors: 649

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (649, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 649 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 649 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 29 matches and 54 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.934
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 566 weight vectors
  Based on 29 matches and 54 non-matches
  Classified 145 matches and 421 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)
    (421, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)

Current size of match and non-match training data sets: 29 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 421 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 421 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 10 matches and 62 non-matches
    Purity of oracle classification:  0.861
    Entropy of oracle classification: 0.581
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)76_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 76), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)76_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 626
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 626 weight vectors
  Containing 155 true matches and 471 true non-matches
    (24.76% true matches)
  Identified 590 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   562  (95.25%)
          2 :    25  (4.24%)
          3 :     2  (0.34%)
          8 :     1  (0.17%)
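
The duplicate analysis above (unique weight vectors plus an occurrence histogram) amounts to two counting passes; a minimal sketch with illustrative data, not the program's own code:

```python
# Count how often each weight vector occurs, then build the histogram
# "occurrence count -> number of vectors occurring that often".
from collections import Counter

weight_vectors = [  # illustrative values only
    (1.0, 0.0, 0.5), (1.0, 0.0, 0.5), (0.3, 1.0, 0.9), (0.2, 0.0, 0.1),
]
occurrences = Counter(map(tuple, weight_vectors))
print('Number of unique weight vectors:', len(occurrences))

histogram = Counter(occurrences.values())
for occ, num in sorted(histogram.items()):
    print('%4d : %4d  (%.2f%%)' % (occ, num, 100.0 * num / len(occurrences)))
```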

Identified 1 non-pure unique weight vector (from 590 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 139
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 450

Removed 8 non-pure weight vectors
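
The pureness filter above keeps only unique weight vectors whose occurrences all carry the same true match status (pureness 1.000 or 0.000) and removes the rest; a sketch with hypothetical data:

```python
# Drop weight vectors whose occurrences mix true matches and non-matches.
from collections import defaultdict

# (vector, true_match_status) pairs; values are illustrative
labelled = [((0.9, 0.8), True), ((0.9, 0.8), True), ((0.9, 0.8), False),
            ((0.1, 0.2), False)]

counts = defaultdict(lambda: [0, 0])  # vector -> [num_match, num_total]
for vec, is_match in labelled:
    counts[vec][0] += int(is_match)
    counts[vec][1] += 1

pure = {vec for vec, (m, t) in counts.items() if m / t in (0.0, 1.0)}
filtered = [(v, s) for v, s in labelled if v in pure]
print('Removed %d non-pure weight vectors'
      % (len(labelled) - len(filtered)))  # Removed 3 non-pure weight vectors
```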

Final number of weight vectors to use: 618
  Number of unique weight vectors: 589

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (589, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 589 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82
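
The sample sizes in this log (82 of 589, 84 of 686, 73 of 475, 57 of 150) are consistent with Cochran's sample-size formula with finite-population correction, assuming a 95% confidence level (z = 1.96), margin of error e = 0.1, and p set to the cluster's estimated match proportion. These parameters are inferred from the logged numbers, not confirmed against the source code, and one logged value (43 of 91) differs by one, so the exact rounding may differ:

```python
# Cochran's sample-size formula with finite-population correction;
# z, e, and the use of the estimated match proportion as p are
# assumptions inferred from the numbers in this log.
def sample_size(cluster_size, match_prop, z=1.96, e=0.1):
    pq = match_prop * (1.0 - match_prop)
    num = cluster_size * z * z * pq
    den = (cluster_size - 1) * e * e + z * z * pq
    return int(num / den)

print(sample_size(589, 0.5))  # 82, as reported above
```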

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 589 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
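
The "farthest first" selection reported above greedily picks, at each step, the weight vector whose minimum distance to the already-selected vectors is largest; a minimal sketch (the starting vector and the Euclidean metric are assumptions):

```python
# Greedy farthest-first traversal: maximise the minimum distance to the
# vectors selected so far.
import math

def farthest_first(vectors, k, start=0):
    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        # pick the vector farthest from its nearest selected neighbour
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (1.0, 0.0)]
print(farthest_first(vecs, 3))  # [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```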

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 24 matches and 58 non-matches
    Purity of oracle classification:  0.707
    Entropy of oracle classification: 0.872
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
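
The oracle is simulated from the true match status: with accuracy a, roughly a fraction (1 - a) of the queried labels are flipped, and at 100.00% accuracy, as in this run, none are. A sketch of one plausible implementation (the flip-selection scheme is an assumption):

```python
# Simulated oracle: flip round((1 - accuracy) * n) randomly chosen labels.
import random

def oracle_classify(true_labels, accuracy, rng=random.Random(42)):
    num_wrong = int(round((1.0 - accuracy) * len(true_labels)))
    wrong_idx = set(rng.sample(range(len(true_labels)), num_wrong))
    return [(not lbl) if i in wrong_idx else lbl
            for i, lbl in enumerate(true_labels)]

labels = [True] * 24 + [False] * 58
print(oracle_classify(labels, 1.0) == labels)  # True: nothing flipped
```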

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 507 weight vectors
  Based on 24 matches and 58 non-matches
  Classified 91 matches and 416 non-matches
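
The split step trains a classifier on the oracle-labelled sample (24 matches, 58 non-matches) and partitions the remaining 507 vectors with it. A sketch assuming scikit-learn's SVC on synthetic stand-in data; the kernel and parameters are illustrative, not taken from the original program:

```python
# Train an SVM on the oracle-labelled weight vectors, then classify the
# rest of the cluster. Data here is synthetic stand-in data.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
train_X = np.vstack([rng.uniform(0.6, 1.0, (24, 7)),   # labelled matches
                     rng.uniform(0.0, 0.5, (58, 7))])  # labelled non-matches
train_y = np.array([1] * 24 + [0] * 58)

rest_X = rng.uniform(0.0, 1.0, (507, 7))  # unlabelled cluster remainder

clf = SVC(kernel='rbf').fit(train_X, train_y)
pred = clf.predict(rest_X)
print('Classified %d matches and %d non-matches'
      % (pred.sum(), (pred == 0).sum()))
```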

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (91, 0.7073170731707317, 0.8721617883411701, 0.2926829268292683)
    (416, 0.7073170731707317, 0.8721617883411701, 0.2926829268292683)

Current size of match and non-match training data sets: 24 / 58

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 91 weight vectors
- Estimated match proportion 0.293

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 91 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 41 matches and 2 non-matches
    Purity of oracle classification:  0.953
    Entropy of oracle classification: 0.271
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing the file: diverg(10)497_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 497), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)497_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 719
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 719 weight vectors
  Containing 208 true matches and 511 true non-matches
    (28.93% true matches)
  Identified 686 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   672  (97.96%)
          2 :    11  (1.60%)
          3 :     2  (0.29%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 686 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 510

Removed 1 non-pure weight vector

Final number of weight vectors to use: 718
  Number of unique weight vectors: 686

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (686, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 686 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 686 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 602 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 127 matches and 475 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (127, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (475, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 475 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 475 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.462, 0.609, 0.643, 0.706, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.533, 0.667, 0.333, 0.714, 0.632] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 11 matches and 62 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(15)369_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 369), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)369_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 630
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 630 weight vectors
  Containing 209 true matches and 421 true non-matches
    (33.17% true matches)
  Identified 594 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   577  (97.14%)
          2 :    14  (2.36%)
          3 :     2  (0.34%)
         19 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 594 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 418

Removed 1 non-pure weight vector

Final number of weight vectors to use: 629
  Number of unique weight vectors: 594

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (594, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 594 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 594 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 32 matches and 50 non-matches
    Purity of oracle classification:  0.610
    Entropy of oracle classification: 0.965
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0
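The purity and entropy reported above follow directly from the oracle's match and non-match counts: purity is the majority-class proportion, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (the function name is illustrative, not from the script):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity and binary entropy of a cluster from its class counts."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)        # majority-class proportion
    entropy = 0.0                    # binary Shannon entropy, 0*log(0) := 0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

purity, entropy = cluster_stats(32, 50)
print(round(purity, 3), round(entropy, 3))  # 0.61 0.965
```

With 32 matches and 50 non-matches this reproduces the 0.610 purity and 0.965 entropy shown in the log.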

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 512 weight vectors
  Based on 32 matches and 50 non-matches
  Classified 150 matches and 362 non-matches
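The split step trains a binary classifier on the oracle-labelled weight vectors and partitions the remaining cluster by predicted class. A sketch using scikit-learn's SVC; the kernel choice and function name here are assumptions, not necessarily what the script uses:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on oracle-labelled weight vectors (1 = match,
    0 = non-match) and split the remaining vectors into a
    predicted-match and a predicted-non-match cluster."""
    clf = SVC(kernel="linear")
    clf.fit(labelled_vecs, labels)
    pred = clf.predict(unlabelled_vecs)
    return unlabelled_vecs[pred == 1], unlabelled_vecs[pred == 0]

# Toy usage with 1-D "weight vectors"
X = np.array([[0.9], [0.8], [0.1], [0.2]])
y = np.array([1, 1, 0, 0])
m, n = svm_split(X, y, np.array([[0.95], [0.05]]))
```

The two returned arrays become the two child clusters placed back on the queue, as in the Loop 2 output below.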

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)
    (362, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)

Current size of match and non-match training data sets: 32 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.96
- Size 150 weight vectors
- Estimated match proportion 0.390

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
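Farthest-first selection greedily grows the sample by always taking the weight vector whose minimum distance to the already-selected vectors is largest, so the sample spreads across the cluster. A minimal sketch, assuming Euclidean distance and seeding with the first vector (both details are assumptions):

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising its minimum
    distance to the already-selected set (farthest-first traversal)."""
    selected = [vectors[0]]  # assumption: seed with the first vector
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(euclidean(v, s) for s in selected),
        )
        selected.append(best)
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
print(farthest_first(vecs, 2))  # [(0.0, 0.0), (1.0, 1.0)]
```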

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 51 matches and 6 non-matches
    Purity of oracle classification:  0.895
    Entropy of oracle classification: 0.485
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)409_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (10, 1 - acm diverg, 409), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)409_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 713
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 713 weight vectors
  Containing 169 true matches and 544 true non-matches
    (23.70% true matches)
  Identified 676 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   645  (95.41%)
          2 :    28  (4.14%)
          3 :     2  (0.30%)
          6 :     1  (0.15%)
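The occurrence distribution above counts, for each multiplicity, how many distinct weight vectors appear exactly that often. A sketch with collections.Counter (the function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight
    vectors that occur exactly that often."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

vecs = [[0.5, 1.0], [0.5, 1.0], [0.2, 0.3],
        [0.9, 0.9], [0.9, 0.9], [0.9, 0.9]]
print(sorted(occurrence_distribution(vecs).items()))
# [(1, 1), (2, 1), (3, 1)]
```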

Identified 0 non-pure unique weight vectors (from 676 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 152
     0.000 : 524

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 713
  Number of unique weight vectors: 676

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (676, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 676 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 676 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 592 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 120 matches and 472 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (120, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (472, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 120 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 120 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 44 matches and 6 non-matches
    Purity of oracle classification:  0.880
    Entropy of oracle classification: 0.529
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing file: diverg(20)524_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 524), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)524_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1042
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1042 weight vectors
  Containing 222 true matches and 820 true non-matches
    (21.31% true matches)
  Identified 988 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   951  (96.26%)
          2 :    34  (3.44%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 988 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 799

Removed 1 non-pure weight vector
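Pureness of a unique weight vector is the proportion of its occurrences generated by true matches; the 0.941 entry above corresponds to a vector occurring 17 times, 16 of them as matches. A minimal sketch (the helper name is illustrative):

```python
from collections import defaultdict

def pureness(pairs):
    """pairs: list of (weight_vector_tuple, is_match). Returns the
    match proportion (pureness) of each unique weight vector."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [matches, non-matches]
    for vec, is_match in pairs:
        counts[vec][0 if is_match else 1] += 1
    return {vec: m / (m + n) for vec, (m, n) in counts.items()}

data = [((0.5, 1.0), True)] * 16 + [((0.5, 1.0), False)]
print(round(pureness(data)[(0.5, 1.0)], 3))  # 0.941
```

Vectors whose pureness is strictly between 0 and 1 are non-pure; the script removes their minority-class copies before clustering.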

Final number of weight vectors to use: 1041
  Number of unique weight vectors: 988

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (988, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 988 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 988 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 901 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 145 matches and 756 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (756, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 145 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)69_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 69), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)69_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 391
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 391 weight vectors
  Containing 214 true matches and 177 true non-matches
    (54.73% true matches)
  Identified 358 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   342  (95.53%)
          2 :    13  (3.63%)
          3 :     2  (0.56%)
         17 :     1  (0.28%)

Identified 1 non-pure unique weight vector (from 358 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 176

Removed 1 non-pure weight vector

Final number of weight vectors to use: 390
  Number of unique weight vectors: 358

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (358, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 358 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 358 vectors
  The selected farthest weight vectors are:
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
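
The "far" initial selection above is a farthest-first traversal: starting from one vector, repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal sketch (the script's actual starting vector, distance metric, and tie-breaking may differ):

```python
def euclidean(a, b):
    # Euclidean distance between two equal-length weight vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def farthest_first(vectors, k, start=0):
    # Greedy farthest-first traversal selecting k of the given vectors.
    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        # Pick the remaining vector farthest from the selected set,
        # measured by its minimum distance to any selected vector.
        best = max(remaining,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

This greedy rule spreads the sample across the cluster, which is why the selected vectors above mix clear matches and clear non-matches.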

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 44 matches and 32 non-matches
    Purity of oracle classification:  0.579
    Entropy of oracle classification: 0.982
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  32
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 282 weight vectors
  Based on 44 matches and 32 non-matches
  Classified 140 matches and 142 non-matches
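
The split step trains a classifier on the oracle-labelled vectors and uses its predictions to partition the remaining cluster into a candidate-match and a candidate-non-match subcluster. The log shows an SVM doing this; as a dependency-free stand-in that illustrates the same train-then-split pattern, here is a nearest-centroid sketch (function names are illustrative):

```python
def centroid(vecs):
    # Component-wise mean of a non-empty list of weight vectors.
    n = len(vecs)
    return [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]

def split_cluster(match_train, nonmatch_train, cluster):
    # Split the remaining cluster by nearest labelled centroid
    # (the actual script uses an SVM for this classification step).
    cm, cn = centroid(match_train), centroid(nonmatch_train)
    sq_dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    matches = [v for v in cluster if sq_dist(v, cm) <= sq_dist(v, cn)]
    non_matches = [v for v in cluster if sq_dist(v, cm) > sq_dist(v, cn)]
    return matches, non_matches
```

Both resulting subclusters inherit the parent sample's purity/entropy estimates, which is why the two queue entries in the next loop carry identical statistics.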

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.5789473684210527, 0.9819407868640977, 0.5789473684210527)
    (142, 0.5789473684210527, 0.9819407868640977, 0.5789473684210527)

Current size of match and non-match training data sets: 44 / 32

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 142 weight vectors
- Estimated match proportion 0.579

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.146, 0.130, 0.176, 0.318, 0.167] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.821, 1.000, 0.275, 0.297, 0.227, 0.255, 0.152] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.881, 1.000, 0.211, 0.250, 0.129, 0.250, 0.211] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.850, 1.000, 0.179, 0.205, 0.188, 0.061, 0.180] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.929, 1.000, 0.182, 0.238, 0.188, 0.146, 0.270] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.750, 1.000, 0.243, 0.243, 0.214, 0.111, 0.132] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.947, 1.000, 0.292, 0.178, 0.227, 0.122, 0.154] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.663, 1.000, 0.273, 0.244, 0.226, 0.196, 0.238] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.592, 1.000, 0.179, 0.205, 0.156, 0.273, 0.180] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 1.000, 0.224, 0.219, 0.140, 0.209, 0.161] (False)
    [0.663, 1.000, 0.132, 0.143, 0.241, 0.174, 0.167] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.902, 1.000, 0.182, 0.071, 0.182, 0.222, 0.190] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.747, 1.000, 0.231, 0.167, 0.107, 0.222, 0.125] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 4 matches and 53 non-matches
    Purity of oracle classification:  0.930
    Entropy of oracle classification: 0.367
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(15)979_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 979), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)979_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 790
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 790 weight vectors
  Containing 218 true matches and 572 true non-matches
    (27.59% true matches)
  Identified 752 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   734  (97.61%)
          2 :    15  (1.99%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 752 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 569

Removed 1 non-pure weight vector

Final number of weight vectors to use: 789
  Number of unique weight vectors: 752

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (752, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 752 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 752 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 667 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 143 matches and 524 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (524, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 143 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 143 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 48 matches and 6 non-matches
    Purity of oracle classification:  0.889
    Entropy of oracle classification: 0.503
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing file: diverg(10)130_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976562
recall                  0.41806
f-measure               0.58548
da                          128
dm                            0
ndm                           0
tp                          125
fp                            3
tn                  4.76529e+07
fn                          174
Name: (10, 1 - acm diverg, 130), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)130_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 591
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 591 weight vectors
  Containing 133 true matches and 458 true non-matches
    (22.50% true matches)
  Identified 560 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   532  (95.00%)
          2 :    25  (4.46%)
          3 :     3  (0.54%)

Identified 0 non-pure unique weight vectors (from 560 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 122
     0.000 : 438

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 591
  Number of unique weight vectors: 560

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (560, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 560 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 560 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 25 matches and 57 non-matches
    Purity of oracle classification:  0.695
    Entropy of oracle classification: 0.887
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
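The purity and entropy figures the oracle step reports follow the standard two-class definitions: purity is the fraction of the majority class, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (the function name is illustrative, not from the program):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Two-class purity and binary Shannon entropy of a labelled set."""
    total = num_matches + num_non_matches
    p = num_matches / total           # match proportion
    purity = max(p, 1.0 - p)          # fraction of the majority class
    if p in (0.0, 1.0):
        entropy = 0.0                 # a pure set carries no entropy
    else:
        entropy = -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)
    return purity, entropy
```

For the 25 matches and 57 non-matches above this gives purity ≈ 0.695 and entropy ≈ 0.887, matching the log.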

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 478 weight vectors
  Based on 25 matches and 57 non-matches
  Classified 72 matches and 406 non-matches
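The split step trains a binary classifier on the oracle-labelled vectors and partitions the remaining cluster by predicted class, pushing the two parts back onto the queue. The program uses an SVM; as a dependency-free stand-in, the same split can be sketched with a nearest-centroid rule (an assumption for illustration, not the actual classifier):

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(match_train, nonmatch_train, cluster):
    """Partition a cluster by distance to class centroids (SVM stand-in)."""
    cm = centroid(match_train)
    cn = centroid(nonmatch_train)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches, non_matches = [], []
    for v in cluster:
        (matches if sqdist(v, cm) <= sqdist(v, cn) else non_matches).append(v)
    return matches, non_matches  # the two new clusters for the queue
```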

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (72, 0.6951219512195121, 0.8871723027673717, 0.3048780487804878)
    (406, 0.6951219512195121, 0.8871723027673717, 0.3048780487804878)

Current size of match and non-match training data sets: 25 / 57

Selected cluster (queue ordering: random):
- Purity 0.70 and entropy 0.89
- Size 72 weight vectors
- Estimated match proportion 0.305

Sample size for this cluster: 39

Farthest first selection of 39 weight vectors from 72 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.511, 1.000, 1.000, 1.000, 1.000, 1.000, 0.947] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
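Farthest-first selection greedily picks, at each step, the vector whose minimum distance to the already-selected set is largest, spreading the sample across the cluster. A dependency-free sketch; the distance metric is not shown in this log, so Euclidean distance here is an assumption:

```python
def farthest_first(vectors, k, start=0):
    """Greedy max-min (farthest-first) selection of k vectors."""
    def dist(a, b):
        # Euclidean distance (assumed metric)
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [start]  # index of the seed vector
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[nxt]))
    return [vectors[i] for i in selected]
```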

Perform oracle with 100.00% accuracy on 39 weight vectors
  The oracle will correctly classify 39 weight vectors and wrongly classify 0
  Classified 39 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 39 weight vectors (classified by oracle) from cluster
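The oracle simulates manual classification at a configurable accuracy (the `oracle_acc` command-line parameter): each true label is returned correctly with probability `oracle_acc` and flipped otherwise. A sketch under the assumption that errors are independent random flips:

```python
import random

def oracle_classify(labelled_vectors, oracle_acc, seed=0):
    """Simulate a manual oracle with the given accuracy in [0, 1].

    labelled_vectors: list of (vector, true_is_match) pairs.
    Each true label is returned correctly with probability oracle_acc,
    and flipped otherwise (independence of errors is an assumption).
    """
    rng = random.Random(seed)
    matches, non_matches = [], []
    for vec, true_is_match in labelled_vectors:
        correct = rng.random() < oracle_acc
        is_match = true_is_match if correct else not true_is_match
        (matches if is_match else non_matches).append(vec)
    return matches, non_matches
```

At `oracle_acc = 1.0` every label is returned correctly, which is why the log shows zero false matches and false non-matches throughout.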

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

128.0
Analysing file: diverg(20)836_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 836), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)836_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)
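The occurrence distribution above counts how often each exact weight vector appears, then tallies those counts. This can be sketched with two passes of `collections.Counter`:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of unique vectors with that count."""
    per_vector = Counter(map(tuple, weight_vectors))  # vector -> #occurrences
    return Counter(per_vector.values())               # #occurrences -> #vectors
```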

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998
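A unique weight vector is non-pure when identical copies of it carry conflicting true-match labels; the minority-class copies are removed so every unique vector ends up with a single label (here one copy, taking 1052 vectors down to 1051). A sketch, assuming each record is a (vector, is_match) pair; the program's tie-breaking rule is not shown in this log:

```python
from collections import defaultdict

def remove_minority_labels(records):
    """Drop minority-class copies of non-pure unique weight vectors.

    records: iterable of (weight_vector_tuple, is_match) pairs.
    Only the majority-label copies of each unique vector are kept
    (ties kept as matches - an assumption).
    """
    by_vec = defaultdict(list)
    for vec, is_match in records:
        by_vec[vec].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        majority = sum(labels) * 2 >= len(labels)
        kept.extend((vec, lab) for lab in labels if lab == majority)
    return kept
```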

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 793 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)233_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 233), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)233_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1059
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1059 weight vectors
  Containing 227 true matches and 832 true non-matches
    (21.44% true matches)
  Identified 1002 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   965  (96.31%)
          2 :    34  (3.39%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1002 unique weight vectors)
Pureness (percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 811

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1058
  Number of unique weight vectors: 1002

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1002, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1002 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1002 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 915 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 177 matches and 738 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (177, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (738, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 738 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 738 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)347_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 347), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)347_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 367
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 367 weight vectors
  Containing 194 true matches and 173 true non-matches
    (52.86% true matches)
  Identified 340 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   324  (95.29%)
          2 :    13  (3.82%)
          3 :     2  (0.59%)
         11 :     1  (0.29%)

Identified 1 non-pure unique weight vector (from 340 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 169
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 170

Removed 1 non-pure weight vector
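
The pureness table above can be reproduced with a few lines: group identical weight vectors, take the fraction of each group's occurrences that are true matches, and drop the minority-class copies of any group whose pureness is strictly between 0 and 1. A minimal sketch (the function name and the `(vector, is_match)` layout are illustrative, not the program's actual internals):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """Keep only the majority-class copies of each unique weight vector.

    weight_vectors: list of (vector_tuple, is_match) pairs.
    Pure groups (pureness 0.0 or 1.0) survive intact; for a non-pure
    group, the minority-class copies are removed.
    """
    counts = defaultdict(lambda: [0, 0])  # vec -> [num_matches, num_non_matches]
    for vec, is_match in weight_vectors:
        counts[vec][0 if is_match else 1] += 1

    kept = []
    for vec, is_match in weight_vectors:
        m, n = counts[vec]
        # A copy is kept exactly when it belongs to its group's majority class.
        if is_match == (m >= n):
            kept.append((vec, is_match))
    return kept
```

On the run above, this would remove the single non-match copy of the 0.909-pure vector (10 matches, 1 non-match across its 11 occurrences), leaving 366 of the 367 vectors.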

Final number of weight vectors to use: 366
  Number of unique weight vectors: 340

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (340, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 340 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 340 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
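
The "far" method above reads as the classic farthest-first traversal: seed the selection with one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming a Euclidean metric and a fixed first seed (the program's actual seed choice and metric are not shown in this log):

```python
import math

def farthest_first(vectors, k, seed_index=0):
    """Select k vectors by farthest-first traversal (sketch; seed choice
    and Euclidean metric are assumptions)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[seed_index]]
    remaining = [v for i, v in enumerate(vectors) if i != seed_index]
    while len(selected) < k and remaining:
        # A candidate's distance to the selected set is its minimum
        # distance to any already-selected vector; take the largest.
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```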

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 33 matches and 42 non-matches
    Purity of oracle classification:  0.560
    Entropy of oracle classification: 0.990
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  42
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 265 weight vectors
  Based on 33 matches and 42 non-matches
  Classified 138 matches and 127 non-matches
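
The split step trains a binary classifier on the oracle-labelled sample and partitions the rest of the cluster by its predictions, producing the two child clusters queued in the next loop. The run uses an SVM; since its library and parameters are not shown in the log, a simple nearest-centroid classifier stands in below to illustrate the split mechanics only:

```python
import math

def split_cluster(train_match, train_non_match, remaining):
    """Split the remaining (unlabelled) weight vectors of a cluster using
    a classifier trained on the oracle-labelled sample. Nearest-centroid
    is a stand-in for the SVM the run actually uses."""
    def centroid(vecs):
        return [sum(c) / len(vecs) for c in zip(*vecs)]

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    c_match = centroid(train_match)
    c_non = centroid(train_non_match)
    match_child, non_match_child = [], []
    for v in remaining:
        if dist(v, c_match) <= dist(v, c_non):
            match_child.append(v)
        else:
            non_match_child.append(v)
    return match_child, non_match_child
```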

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.56, 0.9895875212220557, 0.44)
    (127, 0.56, 0.9895875212220557, 0.44)

Current size of match and non-match training data sets: 33 / 42

Selected cluster with (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 138 weight vectors
- Estimated match proportion 0.440

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 138 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 53 matches and 3 non-matches
    Purity of oracle classification:  0.946
    Entropy of oracle classification: 0.301
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0
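
The purity and entropy figures reported throughout are the majority-class fraction and the binary Shannon entropy of the match proportion; the logged values are consistent with this reading (e.g. 53 matches / 3 non-matches above gives purity 53/56 ≈ 0.946 and entropy ≈ 0.301). A sketch:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary Shannon entropy
    (base 2) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total          # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy
```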

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing file: diverg(20)952_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (20, 1 - acm diverg, 952), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)952_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1026
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1026 weight vectors
  Containing 198 true matches and 828 true non-matches
    (19.30% true matches)
  Identified 984 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   949  (96.44%)
          2 :    32  (3.25%)
          3 :     2  (0.20%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 984 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 808

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 984

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (984, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 984 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 984 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 897 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 93 matches and 804 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (93, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (804, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 804 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 804 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analyzing file: diverg(10)145_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 145), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)145_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 290
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 290 weight vectors
  Containing 192 true matches and 98 true non-matches
    (66.21% true matches)
  Identified 267 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   255  (95.51%)
          2 :     9  (3.37%)
          3 :     2  (0.75%)
         11 :     1  (0.37%)

Identified 1 non-pure unique weight vector (from 267 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 169
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 97

Removed 1 non-pure weight vector

Final number of weight vectors to use: 289
  Number of unique weight vectors: 267

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (267, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 267 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 71

Perform initial selection using "far" method

Farthest first selection of 71 weight vectors from 267 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
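
The "far" method used above is farthest-first traversal: seed the selection with one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and seeding with the first vector (the program's metric and seed choice may differ):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed choice is an assumption
    while len(selected) < k:
        # add the candidate whose nearest selected neighbour is farthest away
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

Each new pick maximises the distance to its nearest already-selected neighbour, which is why the listed vectors spread across the corners of the weight space rather than clustering together.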

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 36 matches and 35 non-matches
    Purity of oracle classification:  0.507
    Entropy of oracle classification: 1.000
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  35
    Number of false non-matches: 0
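
The purity and entropy figures reported for each oracle classification follow the standard two-class definitions: purity is the majority-class fraction, and entropy is the Shannon entropy (in bits) of the match proportion. A sketch reproducing the 36-match/35-non-match split above:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Two-class purity (majority-class fraction) and Shannon entropy in bits."""
    n = num_match + num_non_match
    p = num_match / n  # match proportion
    purity = max(p, 1.0 - p)
    # entropy of a Bernoulli(p) distribution; 0 by convention for pure clusters
    entropy = 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy
```

For 36 matches and 35 non-matches this gives purity 0.507 and entropy 1.000 (rounded), matching the figures above and the full-precision values shown in the cluster queue.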

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 196 weight vectors
  Based on 36 matches and 35 non-matches
  Classified 137 matches and 59 non-matches
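
The split step trains a classifier on the oracle-labelled vectors and partitions the rest of the cluster by predicted class. A minimal sketch with scikit-learn's SVC, assuming default parameters (the program's SVM configuration may differ):

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, remaining_vectors):
    """Train an SVM on the oracle-labelled vectors, then split the remaining
    unlabelled vectors into predicted-match and predicted-non-match
    sub-clusters."""
    clf = SVC()  # default RBF kernel is an assumption
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(remaining_vectors, predictions) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are pushed back onto the queue, which is why the queue length grows to 2 in the next loop.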

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 71
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (137, 0.5070422535211268, 0.9998568991526107, 0.5070422535211268)
    (59, 0.5070422535211268, 0.9998568991526107, 0.5070422535211268)

Current size of match and non-match training data sets: 36 / 35

Selected cluster (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 59 weight vectors
- Estimated match proportion 0.507

Sample size for this cluster: 37

Farthest first selection of 37 weight vectors from 59 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.456, 1.000, 0.087, 0.208, 0.125, 0.152, 0.061] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.800, 1.000, 0.242, 0.121, 0.200, 0.171, 0.000] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.619, 1.000, 0.103, 0.163, 0.129, 0.146, 0.213] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)

Perform oracle with 100.00% accuracy on 37 weight vectors
  The oracle will correctly classify 37 weight vectors and wrongly classify 0
  Classified 3 matches and 34 non-matches
    Purity of oracle classification:  0.919
    Entropy of oracle classification: 0.406
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  34
    Number of false non-matches: 0

Deleted 37 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(20)668_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 668), dtype: object
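
The precision, recall and f-measure values in these per-file summaries follow the standard definitions from the tp, fp and fn counts; for the row above, precision = 42/(42+0) = 1, recall = 42/(42+257) ≈ 0.140468 and f-measure = 2PR/(P+R) ≈ 0.246334. A sketch of the arithmetic:

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure
```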

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)668_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1043
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1043 weight vectors
  Containing 222 true matches and 821 true non-matches
    (21.28% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   952  (96.26%)
          2 :    34  (3.44%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 800

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1042
  Number of unique weight vectors: 989
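
The pureness analysis above groups identical weight vectors and, for each unique vector, computes the fraction of its occurrences that come from true matches; any vector that is neither fully pure (1.000) nor fully impure (0.000) has its minority-class occurrences removed, which is how the vector with pureness 0.941 (16 matches out of 17 occurrences) loses its single non-match occurrence. A minimal sketch, with ties kept as matches (tie handling is an assumption):

```python
from collections import defaultdict

def remove_minority_occurrences(vectors, labels):
    """Group identical weight vectors, compute their pureness (fraction of
    occurrences that are true matches), and drop the minority-class
    occurrences of every non-pure vector."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [matches, non-matches]
    for vec, is_match in zip(vectors, labels):
        counts[tuple(vec)][0 if is_match else 1] += 1

    kept_vectors, kept_labels = [], []
    for vec, is_match in zip(vectors, labels):
        m, nm = counts[tuple(vec)]
        majority_is_match = m >= nm  # ties treated as matches (assumption)
        if m == 0 or nm == 0 or is_match == majority_is_match:
            kept_vectors.append(vec)
            kept_labels.append(is_match)
    return kept_vectors, kept_labels
```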

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 145 matches and 757 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (757, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 145 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)591_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 591), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)591_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 745
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 745 weight vectors
  Containing 223 true matches and 522 true non-matches
    (29.93% true matches)
  Identified 709 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   693  (97.74%)
          2 :    13  (1.83%)
          3 :     2  (0.28%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 709 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 521

Removed 1 non-pure weight vector

Final number of weight vectors to use: 744
  Number of unique weight vectors: 709

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (709, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 709 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 709 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 625 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 135 matches and 490 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (490, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 135 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 135 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.909, 1.000, 1.000, 1.000, 0.947] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.420, 1.000, 1.000, 1.000, 1.000, 1.000, 0.947] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 48 matches and 4 non-matches
    Purity of oracle classification:  0.923
    Entropy of oracle classification: 0.391
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0
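
The purity and entropy figures reported for each oracle classification match the usual majority-class fraction and binary (base-2) entropy of the match/non-match proportions. A minimal sketch of that computation (the program's exact implementation is not shown in this output):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = binary
    (base-2) entropy of the match/non-match proportions."""
    total = num_matches + num_non_matches
    p_match = num_matches / total
    p_non = num_non_matches / total
    purity = max(p_match, p_non)
    entropy = -sum(p * math.log2(p) for p in (p_match, p_non) if p > 0)
    return purity, entropy

# The 48 matches / 4 non-matches classified above:
purity, entropy = purity_entropy(48, 4)
print(round(purity, 3), round(entropy, 3))  # 0.923 0.391
```

These reproduce the 0.923 / 0.391 values printed above.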

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing the file: diverg(20)827_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 827), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)827_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 790
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 790 weight vectors
  Containing 208 true matches and 582 true non-matches
    (26.33% true matches)
  Identified 761 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   744  (97.77%)
          2 :    14  (1.84%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)
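
A frequency distribution like the one above (how often each unique weight vector occurs) can be produced with two nested `collections.Counter` passes. The vectors below are hypothetical stand-ins for the ones loaded from the weight vector file:

```python
from collections import Counter

# Hypothetical weight vectors (tuples, so they are hashable); the real
# ones come from the loaded weight vector file.
vectors = [(1.0, 0.5), (1.0, 0.5), (0.3, 0.8), (1.0, 0.5), (0.2, 0.1)]

vec_counts = Counter(vectors)             # unique vector -> occurrences
freq_dist = Counter(vec_counts.values())  # occurrence -> number of vectors

for occ in sorted(freq_dist):
    num = freq_dist[occ]
    print('%10d : %5d  (%.2f%%)' % (occ, num, 100.0 * num / len(vec_counts)))
```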

Identified 1 non-pure unique weight vector (from 761 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 789
  Number of unique weight vectors: 761

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500
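
The estimated match proportions in the loop summaries are consistent with the fraction of matches in the oracle-classified sample (e.g. 30/85 ≈ 0.353 in Loop 2 below), with a default of 0.5 before anything has been classified. A sketch, assuming that is how the estimate is derived:

```python
def estimated_match_proportion(num_matches, num_classified):
    """Estimate a cluster's match proportion from the oracle-classified
    sample; default to 0.5 before any manual classification."""
    if num_classified == 0:
        return 0.5
    return num_matches / num_classified
```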

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (761, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 761 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 761 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.429, 0.786, 0.750, 0.389, 0.857] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
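
"Farthest first selection" presumably refers to the standard greedy farthest-first traversal: start from one vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. A minimal sketch under that assumption:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric tuples."""
    selected = [vectors[0]]
    # Distance from every vector to its nearest selected vector so far.
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    for _ in range(k - 1):
        # Pick the vector farthest from all currently selected vectors.
        idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected

pts = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
```

Selecting far-apart vectors spreads the oracle budget across the similarity space rather than sampling one dense region.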

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 676 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 139 matches and 537 non-matches
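
The split step trains a classifier on the oracle-labelled sample and divides the remaining vectors into two child clusters. A sketch using scikit-learn's `SVC` (an assumption; the original program's SVM library and settings are not shown in this output):

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on the oracle-classified sample (labels 1 = match,
    0 = non-match), then split the remaining weight vectors into
    predicted-match and predicted-non-match child clusters."""
    clf = svm.SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)
    predictions = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, predictions) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, predictions) if p == 0]
    return matches, non_matches
```

Each child cluster is then pushed back onto the queue with updated size and estimated match proportion, as the Loop 2 summary below shows.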

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (139, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (537, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 537 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 537 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.556, 0.429, 0.500, 0.700, 0.643] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 7 matches and 68 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.447
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing the file: diverg(10)723_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.982143
recall                 0.183946
f-measure              0.309859
da                           56
dm                            0
ndm                           0
tp                           55
fp                            1
tn                  4.76529e+07
fn                          244
Name: (10, 1 - acm diverg, 723), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)723_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 204 true matches and 749 true non-matches
    (21.41% true matches)
  Identified 902 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   868  (96.23%)
          2 :    31  (3.44%)
          3 :     2  (0.22%)
         17 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 902 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 173
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 728

Removed 1 non-pure weight vector

Final number of weight vectors to use: 952
  Number of unique weight vectors: 902

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (902, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 902 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 902 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 23 matches and 63 non-matches
    Purity of oracle classification:  0.733
    Entropy of oracle classification: 0.838
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 816 weight vectors
  Based on 23 matches and 63 non-matches
  Classified 107 matches and 709 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (107, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)
    (709, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)

Current size of match and non-match training data sets: 23 / 63

Selected cluster (queue ordering: random):
- Purity 0.73 and entropy 0.84
- Size 709 weight vectors
- Estimated match proportion 0.267

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 709 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

56.0
Analyzing the file: diverg(15)364_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981481
recall                 0.177258
f-measure              0.300283
da                           54
dm                            0
ndm                           0
tp                           53
fp                            1
tn                  4.76529e+07
fn                          246
Name: (15, 1 - acm diverg, 364), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)364_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1061
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1061 weight vectors
  Containing 213 true matches and 848 true non-matches
    (20.08% true matches)
  Identified 1007 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   972  (96.52%)
          2 :    32  (3.18%)
          3 :     2  (0.20%)
         19 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1007 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 179
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1060
  Number of unique weight vectors: 1007

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1007, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1007 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1007 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
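
The farthest-first ("far") selection above greedily grows the sample: each step adds the weight vector whose minimum distance to the vectors already selected is largest, so the sample spreads across the cluster. A minimal sketch, assuming Euclidean distance and first-vector seeding (the real program's seeding and tie-breaking may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors.

    Starts from the first vector; each step adds the candidate whose
    minimum Euclidean distance to the selected set is largest.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

# Toy example: the three mutually distant corners are picked first.
corners = farthest_first([(0, 0), (1, 1), (0.1, 0.1), (1, 0)], 3)
```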

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0
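
Purity here is the majority-class fraction of the oracle-labelled sample, and entropy its two-class Shannon entropy in bits (an oracle with accuracy below 100% would flip each label with probability 1 - accuracy). A sketch showing how 23 matches and 64 non-matches yield the 0.736 / 0.833 figures above:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Two-class purity (majority fraction) and Shannon entropy in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(23, 64)
print(round(purity, 3), round(entropy, 3))  # 0.736 0.833
```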

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 920 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 100 matches and 820 non-matches
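
The oracle-labelled sample then serves as training data for a classifier that labels the remaining 920 vectors, splitting the cluster into a predicted-match and a predicted-non-match sub-cluster, both pushed back onto the queue. The log uses an SVM; the sketch below substitutes a simple nearest-centroid rule purely to show the split mechanics, so the function and its behaviour are illustrative, not the source's classifier:

```python
import math

def split_cluster(unlabelled, match_seeds, non_match_seeds):
    """Split a cluster with a nearest-centroid rule (stand-in for the SVM).

    Returns (predicted_matches, predicted_non_matches); both sub-clusters
    would then be pushed back onto the processing queue.
    """
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    c_m, c_n = centroid(match_seeds), centroid(non_match_seeds)
    matches, non_matches = [], []
    for v in unlabelled:
        (matches if dist(v, c_m) <= dist(v, c_n) else non_matches).append(v)
    return matches, non_matches

# Toy example with 2-d weight vectors (real vectors have 7 components).
m, n = split_cluster([[0.9, 0.9], [0.1, 0.2]],
                     match_seeds=[[1.0, 1.0], [0.8, 1.0]],
                     non_match_seeds=[[0.0, 0.1], [0.2, 0.0]])
```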

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (100, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 100 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 100 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 43 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

54.0
Analysing file: diverg(15)471_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976562
recall                  0.41806
f-measure               0.58548
da                          128
dm                            0
ndm                           0
tp                          125
fp                            3
tn                  4.76529e+07
fn                          174
Name: (15, 1 - acm diverg, 471), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)471_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 823
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 823 weight vectors
  Containing 130 true matches and 693 true non-matches
    (15.80% true matches)
  Identified 792 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   764  (96.46%)
          2 :    25  (3.16%)
          3 :     3  (0.38%)

Identified 0 non-pure unique weight vectors (from 792 unique weight vectors)
Pureness (as the fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 119
     0.000 : 673

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 823
  Number of unique weight vectors: 792

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (792, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 792 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 792 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 26 matches and 59 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.888
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 707 weight vectors
  Based on 26 matches and 59 non-matches
  Classified 99 matches and 608 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (99, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)
    (608, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)

Current size of match and non-match training data sets: 26 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 608 weight vectors
- Estimated match proportion 0.306

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 608 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 0 matches and 72 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

128.0
Analysing file: diverg(15)766_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 766), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)766_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 854
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 854 weight vectors
  Containing 221 true matches and 633 true non-matches
    (25.88% true matches)
  Identified 798 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   762  (95.49%)
          2 :    33  (4.14%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 798 unique weight vectors)
Pureness (as the fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 612

Removed 1 non-pure weight vector

Final number of weight vectors to use: 853
  Number of unique weight vectors: 798

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (798, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 798 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 798 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 713 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 150 matches and 563 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (563, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 563 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 563 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
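Farthest-first selection, as used above, greedily picks weight vectors that are maximally spread out in the similarity space. A minimal sketch under the assumption of Euclidean distance and a random starting vector (the actual script's distance measure and seeding may differ):

```python
import random

def farthest_first(vectors, k, seed=None):
    """Greedy farthest-first traversal: start from a random vector, then
    repeatedly add the vector whose minimum distance to the already
    selected set is largest. Euclidean distance is an assumption."""
    rnd = random.Random(seed)
    remaining = list(vectors)
    selected = [remaining.pop(rnd.randrange(len(remaining)))]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    while remaining and len(selected) < k:
        # index of the candidate farthest from its nearest selected vector
        idx = max(range(len(remaining)),
                  key=lambda i: min(dist(remaining[i], s) for s in selected))
        selected.append(remaining.pop(idx))
    return selected
```

This spread-out sample is then handed to the oracle, which explains why the selected vectors above mix clear matches and clear non-matches.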

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 5 matches and 69 non-matches
    Purity of oracle classification:  0.932
    Entropy of oracle classification: 0.357
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(20)574_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 574), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)574_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector
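A non-pure unique weight vector is one whose duplicate occurrences carry both match and non-match labels; the step above keeps the majority class and drops the minority copies. A minimal sketch of that filtering (the helper name is hypothetical):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, is_match):
    """Group identical weight vectors, compute each group's pureness
    (fraction of occurrences that are true matches), and keep only the
    majority-class copies of any non-pure group. Hypothetical helper
    illustrating the analysis step, not the original code."""
    groups = defaultdict(list)
    for wv, label in zip(weight_vectors, is_match):
        groups[tuple(wv)].append(label)
    kept = []
    for wv, labels in groups.items():
        # ties are resolved in favour of the match class (an assumption)
        majority = labels.count(True) >= labels.count(False)
        kept.extend((list(wv), majority) for _ in range(labels.count(majority)))
    return kept
```

For the pureness-0.941 group above (16 match copies, 1 non-match copy) this keeps the 16 match copies and removes the single minority copy, as reported.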

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using the "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 118 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 118 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)510_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 510), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)510_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1027
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1027 weight vectors
  Containing 223 true matches and 804 true non-matches
    (21.71% true matches)
  Identified 973 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   936  (96.20%)
          2 :    34  (3.49%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 973 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 783

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 973

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (973, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 973 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using the "far" method

Farthest first selection of 87 weight vectors from 973 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 886 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 131 matches and 755 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (755, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 755 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 755 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 11 matches and 62 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0
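
The purity and entropy figures reported after each oracle step follow the standard two-class definitions: purity is the majority-class fraction, and entropy is the Shannon entropy (in bits) of the match/non-match split. A minimal sketch (hypothetical helper names) reproducing the numbers above:

```python
import math

def purity(num_matches, num_non_matches):
    # Fraction of weight vectors belonging to the majority class.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Two-class Shannon entropy (in bits) of the match / non-match split.
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# 11 matches and 62 non-matches, as classified by the oracle above:
print(round(purity(11, 62), 3))   # 0.849
print(round(entropy(11, 62), 3))  # 0.612
```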

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)955_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (20, 1 - acm diverg, 955), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)955_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 201 true matches and 752 true non-matches
    (21.09% true matches)
  Identified 908 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   874  (96.26%)
          2 :    31  (3.41%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 908 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 952
  Number of unique weight vectors: 908
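
The non-pure filtering above (here, one unique vector occurring 11 times with pureness 0.909) can be sketched as: group identical weight vectors, and wherever a vector carries both match and non-match labels, drop its minority-class copies. A sketch, assuming each weight vector is a tuple paired with a boolean match label:

```python
from collections import Counter, defaultdict

def remove_minority_class(labelled_vectors):
    # labelled_vectors: list of (weight_vector_tuple, is_match) pairs.
    by_vector = defaultdict(Counter)
    for vec, is_match in labelled_vectors:
        by_vector[vec][is_match] += 1
    kept = []
    for vec, is_match in labelled_vectors:
        counts = by_vector[vec]
        # Keep the pair unless its label is the minority class for this vector.
        if counts[is_match] >= counts[not is_match]:
            kept.append((vec, is_match))
    return kept

# One vector seen 10 times as a match and once as a non-match (pureness 10/11),
# plus one pure non-match vector: the single minority copy is removed.
data = [((0.9, 1.0), True)] * 10 + [((0.9, 1.0), False), ((0.1, 0.2), False)]
print(len(remove_minority_class(data)))  # 11
```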

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (908, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 908 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 908 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
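
Farthest-first selection, used above to draw the sample, greedily builds a diverse subset: start from one vector, then repeatedly add the vector whose distance to the nearest already-selected vector is largest. A minimal sketch (deterministic start, Euclidean distance; the program's own implementation is not shown in this log):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over a list of weight-vector tuples.
    selected = [vectors[0]]
    candidates = list(vectors[1:])
    while len(selected) < k and candidates:
        # Pick the candidate maximising the distance to its nearest selected vector.
        best = max(candidates, key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        candidates.remove(best)
    return selected

vectors = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(vectors, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```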

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 25 matches and 62 non-matches
    Purity of oracle classification:  0.713
    Entropy of oracle classification: 0.865
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 821 weight vectors
  Based on 25 matches and 62 non-matches
  Classified 110 matches and 711 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (110, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)
    (711, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 25 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.71 and entropy 0.87
- Size 711 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 711 vectors
  The selected farthest weight vectors are:
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 13 matches and 58 non-matches
    Purity of oracle classification:  0.817
    Entropy of oracle classification: 0.687
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)902_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 902), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)902_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 711
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 711 weight vectors
  Containing 203 true matches and 508 true non-matches
    (28.55% true matches)
  Identified 685 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   671  (97.96%)
          2 :    11  (1.61%)
          3 :     2  (0.29%)
         12 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 685 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 177
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 507

Removed 1 non-pure weight vector

Final number of weight vectors to use: 710
  Number of unique weight vectors: 685

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (685, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 685 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 685 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 601 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 137 matches and 464 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (137, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (464, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 137 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 137 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 49 matches and 4 non-matches
    Purity of oracle classification:  0.925
    Entropy of oracle classification: 0.386
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)385_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 385), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)385_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 502
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 502 weight vectors
  Containing 188 true matches and 314 true non-matches
    (37.45% true matches)
  Identified 474 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   461  (97.26%)
          2 :    10  (2.11%)
          3 :     2  (0.42%)
         15 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 474 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 160
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 313

Removed 1 non-pure weight vector

Final number of weight vectors to use: 501
  Number of unique weight vectors: 474

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (474, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 474 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 474 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.364, 0.619, 0.471, 0.600, 0.533] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 29 matches and 51 non-matches
    Purity of oracle classification:  0.637
    Entropy of oracle classification: 0.945
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0
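The purity and entropy figures reported above follow directly from the match and non-match counts: purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (an assumption about the exact formulas used, but one that reproduces the 0.637 / 0.945 values in this log):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a labelled sample."""
    total = num_matches + num_non_matches
    p = num_matches / total                           # match proportion
    purity = max(num_matches, num_non_matches) / total
    if p in (0.0, 1.0):                               # a pure sample has zero entropy
        entropy = 0.0
    else:
        entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return purity, entropy

# 29 matches and 51 non-matches, as classified by the oracle above
purity, entropy = purity_entropy(29, 51)
print(f"{purity:.3f} {entropy:.3f}")  # 0.637 0.945
```

The full-precision entropy, 0.944738828646789, is the value carried into the cluster queue in the next loop.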

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 394 weight vectors
  Based on 29 matches and 51 non-matches
  Classified 136 matches and 258 non-matches
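The split step above trains a classifier on the 80 oracle-labelled vectors and partitions the 394 remaining vectors by predicted class. A minimal sketch using scikit-learn's SVC; the feature matrices here are random stand-ins for the real weight vectors, and the linear kernel is an assumption (the split_classifier settings are not shown in this log):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the oracle-labelled sample (29 matches, 51 non-matches)
train_X = rng.random((80, 7))
train_y = np.array([1] * 29 + [0] * 51)

# Stand-ins for the 394 unlabelled weight vectors remaining in the cluster
rest_X = rng.random((394, 7))

clf = SVC(kernel="linear")
clf.fit(train_X, train_y)
pred = clf.predict(rest_X)

# The cluster is split into a predicted-match and a predicted-non-match child
match_child = rest_X[pred == 1]
non_match_child = rest_X[pred == 0]
print(len(match_child), len(non_match_child))
```

Both children inherit the parent sample's purity, entropy, and estimated match proportion, which is why the two queue entries in the next loop carry identical statistics.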

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.6375, 0.944738828646789, 0.3625)
    (258, 0.6375, 0.944738828646789, 0.3625)

Current size of match and non-match training data sets: 29 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 136 weight vectors
- Estimated match proportion 0.362

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.933, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
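Farthest-first selection, as used above, greedily picks the next vector that maximises the minimum distance to all vectors chosen so far, so the sample spreads across the cluster. A minimal sketch of the greedy traversal (assuming Euclidean distance and an arbitrary seed vector; the program's exact distance function and seeding rule are not shown in this log):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each farthest from those already selected."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # arbitrary seed (an assumption)
    # Minimum distance from every candidate to the selected set so far
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected

# Toy 2-D example: the three selected points are mutually far apart
corners = farthest_first([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)], 3)
```

Each selection step costs one pass over the cluster, so selecting k of n vectors is O(k·n) distance evaluations.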

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing the file: diverg(15)745_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 745), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)745_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 744
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 744 weight vectors
  Containing 220 true matches and 524 true non-matches
    (29.57% true matches)
  Identified 708 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   689  (97.32%)
          2 :    16  (2.26%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)
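A frequency distribution like the one above can be computed by hashing each weight vector and counting duplicates, e.g. with collections.Counter (toy vectors below, not the actual data):

```python
from collections import Counter

# Toy weight vectors; tuples so they are hashable
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.9), (0.2, 0.9), (0.2, 0.9), (0.7, 0.1)]

vector_counts = Counter(vectors)             # vector -> how often it occurs
freq_dist = Counter(vector_counts.values())  # occurrence count -> number of vectors

for occurrence, num in sorted(freq_dist.items()):
    pcnt = 100.0 * num / len(vector_counts)
    print(f"{occurrence:>4d} : {num:>5d}  ({pcnt:.2f}%)")
```

The percentages are taken over unique vectors, matching how the log reports 97.32% of the 708 unique vectors occurring exactly once.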

Identified 1 non-pure unique weight vector (from 708 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 521

Removed 1 non-pure weight vector

Final number of weight vectors to use: 743
  Number of unique weight vectors: 708
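A non-pure unique weight vector (the same feature values occurring with both match and non-match labels) is resolved above by discarding its minority-class copies. A minimal sketch on toy data (the exact tie handling in the actual program is not shown in this log):

```python
from collections import defaultdict

# Toy (vector, is_match) pairs; one vector appears with conflicting labels
pairs = [((0.9, 0.8), True), ((0.9, 0.8), True), ((0.9, 0.8), False),
         ((0.1, 0.2), False), ((1.0, 1.0), True)]

groups = defaultdict(list)
for vec, label in pairs:
    groups[vec].append(label)

kept = []
for vec, labels in groups.items():
    pureness = sum(labels) / len(labels)  # fraction of match labels
    if 0.0 < pureness < 1.0:              # non-pure: keep only the majority class
        majority = pureness >= 0.5
        kept.extend((vec, majority) for lab in labels if lab == majority)
    else:                                 # pure: keep all copies unchanged
        kept.extend((vec, lab) for lab in labels)

print(len(pairs) - len(kept), "non-pure weight vectors removed")
```

This keeps every label unambiguous: after removal each unique vector has pureness exactly 0.0 or 1.0, as in the table above.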

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (708, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 708 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 708 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 624 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 151 matches and 473 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (473, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 473 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 473 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.870, 0.619, 0.643, 0.700, 0.524] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 5 matches and 68 non-matches
    Purity of oracle classification:  0.932
    Entropy of oracle classification: 0.360
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0
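All the oracle calls in this run use 100.00% accuracy, so every queried label is correct and the false-match/false-non-match counts stay at zero; with a lower oracle_acc setting some answers would be wrong. A minimal sketch of such a noisy oracle, assuming errors are modelled as independent random label flips (the program's actual error model is not shown in this log):

```python
import random

def noisy_oracle(true_labels, accuracy, seed=42):
    """Return the true labels, each flipped with probability 1 - accuracy."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab for lab in true_labels]

# With 100% accuracy nothing is flipped: 5 matches and 68 non-matches survive intact
labels = [True] * 5 + [False] * 68
answers = noisy_oracle(labels, accuracy=1.0)
assert answers == labels
```

With accuracy below 1.0, the flipped labels would propagate into the match/non-match training sets and degrade the later SVM splits.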

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)624_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984615
recall                 0.214047
f-measure              0.351648
da                           65
dm                            0
ndm                           0
tp                           64
fp                            1
tn                  4.76529e+07
fn                          235
Name: (10, 1 - acm diverg, 624), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)624_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 612
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 612 weight vectors
  Containing 191 true matches and 421 true non-matches
    (31.21% true matches)
  Identified 565 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   532  (94.16%)
          2 :    30  (5.31%)
          3 :     2  (0.35%)
         14 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 565 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 164
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 400

Removed 1 non-pure weight vector

Final number of weight vectors to use: 611
  Number of unique weight vectors: 565

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (565, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 565 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 565 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 483 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 134 matches and 349 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (134, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (349, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 134 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 134 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
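
Farthest-first selection greedily adds, at each step, the vector whose distance to its nearest already-selected vector is largest, so the sample spreads across the cluster. A sketch of the traversal (the actual script's seeding and tie-breaking may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors maximising the minimum Euclidean
    distance to the vectors already chosen (farthest-first traversal)."""
    selected = [vectors[0]]                            # seed: first vector
    nearest = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=nearest.__getitem__)
        selected.append(vectors[i])
        nearest = [min(d, math.dist(v, vectors[i]))    # refresh nearest-selected
                   for d, v in zip(nearest, vectors)]
    return selected
```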

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 48 matches and 4 non-matches
    Purity of oracle classification:  0.923
    Entropy of oracle classification: 0.391
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0
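
The oracle step simulates manual review: each sampled vector receives its true match status, flipped with probability 1 - oracle_acc (at 100% accuracy, as here, nothing is flipped, so false matches and false non-matches are both zero). A hypothetical sketch of that step:

```python
import random

def oracle_classify(true_labels, oracle_acc, rng=None):
    """Simulated manual oracle: return each true label, flipped with
    probability (1 - oracle_acc)."""
    rng = rng or random.Random()
    return [truth if rng.random() < oracle_acc else not truth
            for truth in true_labels]
```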

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

65.0
Analysing the file: diverg(15)953_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 953), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)953_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 847
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 847 weight vectors
  Containing 220 true matches and 627 true non-matches
    (25.97% true matches)
  Identified 791 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   755  (95.45%)
          2 :    33  (4.17%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)
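
The occurrence histogram above (how many distinct weight vectors appear once, twice, and so on) amounts to two counting passes; a sketch using `collections.Counter`, assuming each weight vector is a sequence of similarity weights:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of distinct vectors occurring that often."""
    per_vector = Counter(tuple(v) for v in weight_vectors)  # vector -> frequency
    return dict(sorted(Counter(per_vector.values()).items()))

# e.g. two copies of one vector and two singletons give {1: 2, 2: 1}
```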

Identified 1 non-pure unique weight vector (from 791 unique weight vectors)
Pureness (as the percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 606

Removed 1 non-pure weight vector

Final number of weight vectors to use: 846
  Number of unique weight vectors: 791
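
A unique weight vector is non-pure when the record pairs that produced it disagree on the true match status; its pureness is the fraction of those pairs that are matches. The removal step above drops the minority-class copies of each non-pure vector. A sketch under that reading (the tie handling at pureness 0.5 is a guess):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, is_match_labels):
    """Keep, for each unique vector, only the copies with its majority label."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, is_match_labels):
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)   # fraction of true matches
        majority = pureness >= 0.5             # assumed tie-break toward match
        kept.extend((list(vec), lab) for lab in labels if lab == majority)
    return kept
```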

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (791, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 791 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 791 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 706 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 142 matches and 564 non-matches
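
The split itself trains an SVM on the oracle-labelled sample (27 matches and 58 non-matches here) and partitions the cluster's remaining unlabelled vectors into two child clusters by predicted class. A minimal sketch with scikit-learn; the original script's kernel and parameters are not shown in this log, so sklearn defaults are assumed:

```python
from sklearn import svm

def svm_split(train_vectors, train_labels, unlabelled):
    """Train an SVM on oracle-labelled samples; split the rest by prediction."""
    clf = svm.SVC()                  # assumed: default RBF kernel and C
    clf.fit(train_vectors, train_labels)
    pred = clf.predict(unlabelled)
    matches = [v for v, p in zip(unlabelled, pred) if p]
    non_matches = [v for v, p in zip(unlabelled, pred) if not p]
    return matches, non_matches      # the two child clusters pushed to the queue
```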

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (564, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 564 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 564 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 10 matches and 62 non-matches
    Purity of oracle classification:  0.861
    Entropy of oracle classification: 0.581
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(20)450_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 450), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)450_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 854
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 854 weight vectors
  Containing 226 true matches and 628 true non-matches
    (26.46% true matches)
  Identified 797 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   760  (95.36%)
          2 :    34  (4.27%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 797 unique weight vectors)
Pureness (as the percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 607

Removed 1 non-pure weight vector

Final number of weight vectors to use: 853
  Number of unique weight vectors: 797

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (797, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 797 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 797 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 712 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 148 matches and 564 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (564, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 564 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 564 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 10 matches and 62 non-matches
    Purity of oracle classification:  0.861
    Entropy of oracle classification: 0.581
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)500_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990385
recall                 0.344482
f-measure              0.511166
da                          104
dm                            0
ndm                           0
tp                          103
fp                            1
tn                  4.76529e+07
fn                          196
Name: (10, 1 - acm diverg, 500), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)500_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 991
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 991 weight vectors
  Containing 162 true matches and 829 true non-matches
    (16.35% true matches)
  Identified 952 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   923  (96.95%)
          2 :    26  (2.73%)
          3 :     2  (0.21%)
         10 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 952 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 143
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector

Final number of weight vectors to use: 990
  Number of unique weight vectors: 952

Time to load and analyse the weight vector file: 0.01 sec
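
The pureness filtering reported above groups identical weight vectors, computes the fraction of each unique vector's occurrences that are true matches, and removes every occurrence of vectors that are neither purely matches nor purely non-matches. A minimal stdlib-only sketch under those assumptions (hypothetical helper, not the script's actual code):

```python
from collections import defaultdict

def pureness_filter(weight_vectors, labels):
    """Group identical weight vectors, compute the match fraction
    ("pureness") of each unique vector, and drop every occurrence
    of vectors that are neither pure matches nor pure non-matches."""
    groups = defaultdict(list)  # tuple(vector) -> list of match labels
    for vec, is_match in zip(weight_vectors, labels):
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, lab in groups.items():
        pureness = sum(lab) / len(lab)   # fraction of true matches
        if pureness in (0.0, 1.0):       # pure: keep all occurrences
            kept.extend((list(vec), l) for l in lab)
    return kept

# Example: the mixed vector (pureness 0.5) is removed entirely
data = [([0.9, 1.0], True), ([0.9, 1.0], True),
        ([0.2, 0.1], False), ([0.5, 0.5], True), ([0.5, 0.5], False)]
vecs, labs = zip(*data)
print(len(pureness_filter(vecs, labs)))  # -> 3
```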

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (952, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 952 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 952 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
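
The "farthest first" selection above can be sketched as a greedy traversal: starting from one vector, repeatedly pick the remaining vector whose distance to its nearest already-selected vector is largest. A stdlib-only sketch; the function name, Euclidean metric, and fixed starting point are assumptions, not the script's actual code:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly select the vector
    whose minimum Euclidean distance to the already-selected set is
    largest. Starts from the first vector for determinism."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # for each candidate, distance to its nearest selected vector
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # -> [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```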

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 865 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 269 matches and 596 non-matches
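
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by its predictions. The script uses an SVM; the sketch below substitutes a stdlib-only nearest-centroid classifier as a stand-in so the example stays self-contained, and all names are hypothetical:

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(match_sample, nonmatch_sample, cluster_vecs):
    """Partition cluster_vecs by nearest class centroid -- a simple
    stand-in for the SVM trained on the oracle-labelled sample."""
    c_match = centroid(match_sample)
    c_non = centroid(nonmatch_sample)
    matches, non_matches = [], []
    for v in cluster_vecs:
        if math.dist(v, c_match) <= math.dist(v, c_non):
            matches.append(v)
        else:
            non_matches.append(v)
    return matches, non_matches

m, n = split_cluster([[0.9, 1.0], [1.0, 0.9]],   # oracle-labelled matches
                     [[0.1, 0.0], [0.0, 0.2]],   # oracle-labelled non-matches
                     [[0.95, 0.95], [0.05, 0.1]])
print(len(m), len(n))  # -> 1 1
```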

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (269, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (596, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 596 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 596 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

104.0
Analysing file: diverg(20)363_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 363), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)363_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 226 true matches and 857 true non-matches
    (20.87% true matches)
  Identified 1026 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   989  (96.39%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1026 unique weight vectors)
Pureness (as fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1026

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1026, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1026 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1026 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 938 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 159 matches and 779 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (779, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 779 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 779 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 3 matches and 72 non-matches
    Purity of oracle classification:  0.960
    Entropy of oracle classification: 0.242
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)686_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 686), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)686_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
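The "farthest first" selection above is a greedy max-min procedure: each new weight vector is the one whose smallest Euclidean distance to the already-selected vectors is largest, so the sample spreads across the weight space. A minimal sketch (the seed choice and the Euclidean metric are assumptions; the original program may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising the minimum
    Euclidean distance to the vectors selected so far."""
    assert vectors and 0 < k <= len(vectors)
    selected = [vectors[0]]            # assumed seed: the first vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # pick the remaining vector farthest from its nearest selected one
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected

# Example: from points on a line, the extremes are picked before the middle
pts = [(0.0,), (0.1,), (0.5,), (0.9,), (1.0,)]
print(farthest_first(pts, 3))  # [(0.0,), (1.0,), (0.5,)]
```

This max-min criterion is why the selected vectors above mix clear matches, clear non-matches and borderline cases rather than clustering in one region.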

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
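The purity and entropy reported for an oracle classification follow the usual two-class definitions: purity is the fraction of the majority class, entropy is the binary Shannon entropy of the match proportion. A sketch reproducing the numbers above (23 matches, 65 non-matches):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Two-class purity and binary Shannon entropy of a labelled sample."""
    total = num_matches + num_non_matches
    p = num_matches / total                     # match proportion
    purity = max(num_matches, num_non_matches) / total
    entropy = 0.0
    for q in (p, 1.0 - p):                      # -sum q*log2(q), 0*log2(0) := 0
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(23, 65)
print(round(purity, 3), round(entropy, 3))      # 0.739 0.829
```

The same match proportion (23/88 ≈ 0.261) is what reappears as the "estimated match proportion" of the two child clusters in the next loop.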

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
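After the oracle labels a sample, the remaining vectors in the cluster are classified with an SVM trained on those labels, and the cluster is split into a predicted-match child and a predicted-non-match child (the two queue entries in the next loop). As a dependency-free illustration, the sketch below uses a nearest-centroid rule in place of the SVM; it shows the split mechanics, not the actual SVM decision boundary:

```python
import math

def split_cluster(labeled, unlabeled):
    """labeled: list of (vector, is_match); unlabeled: list of vectors.
    Returns (predicted_matches, predicted_non_matches).
    Nearest-centroid stand-in for the SVM classification step."""
    def centroid(vs):
        return [sum(coords) / len(vs) for coords in zip(*vs)]
    match_cent = centroid([v for v, y in labeled if y])
    non_cent = centroid([v for v, y in labeled if not y])
    matches, non_matches = [], []
    for v in unlabeled:
        if math.dist(v, match_cent) < math.dist(v, non_cent):
            matches.append(v)
        else:
            non_matches.append(v)
    return matches, non_matches

labeled = [([1.0, 1.0], True), ([0.0, 0.0], False)]
m, n = split_cluster(labeled, [[0.9, 0.8], [0.1, 0.2]])
print(len(m), len(n))  # 1 1
```

With a real SVM (e.g. scikit-learn's `sklearn.svm.SVC`, fit on the oracle-labelled vectors), only the classifier changes; the two resulting child clusters are pushed onto the queue exactly as shown.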

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-matches
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)178_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (15, 1 - acm diverg, 178), dtype: object
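The per-run row above carries the raw counts (tp, fp, tn, fn) alongside the derived metrics, which follow the standard definitions: precision = tp/(tp+fp), recall = tp/(tp+fn), and F-measure their harmonic mean. A sketch reproducing the row's values:

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from raw match counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

# tp=62, fp=1, fn=237 as in the row above
p, r, f = prf(62, 1, 237)
print(round(p, 6), round(r, 6), round(f, 6))  # 0.984127 0.207358 0.342541
```

Note that tn (here ~4.77e7, the huge number of true non-matching pairs typical of entity resolution) does not enter any of the three metrics.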

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)178_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 733
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 733 weight vectors
  Containing 202 true matches and 531 true non-matches
    (27.56% true matches)
  Identified 701 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   685  (97.72%)
          2 :    13  (1.85%)
          3 :     2  (0.29%)
         16 :     1  (0.14%)
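The occurrence table above counts, for each multiplicity, how many distinct weight vectors appear that many times; it is a Counter over the values of a Counter. A short sketch (vectors represented as tuples is an assumption here):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map each occurrence count to the number of distinct
    weight vectors that occur exactly that often."""
    per_vector = Counter(tuple(v) for v in vectors)   # vector -> count
    return Counter(per_vector.values())               # count -> how many vectors

vecs = [[0.5, 1.0]] * 3 + [[0.2, 0.4]] * 2 + [[0.9, 0.1]]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```

For the run above: 685 vectors occur once, 13 twice, 2 three times, and one vector occurs 16 times, giving 701 unique vectors from 733.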

Identified 1 non-pure unique weight vector (from 701 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 528

Removed 1 non-pure weight vector

Final number of weight vectors to use: 732
  Number of unique weight vectors: 701

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (701, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 701 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 701 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 617 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 129 matches and 488 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (129, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (488, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 488 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 488 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 8 matches and 67 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.490
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing file: diverg(20)806_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 806), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)806_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 156 matches and 800 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (800, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 800 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 800 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 4 matches and 71 non-matches
    Purity of oracle classification:  0.947
    Entropy of oracle classification: 0.300
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0
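The oracle step above (run here at 100.00 accuracy, so nothing is misclassified) can be simulated by returning each true label correctly with probability `accuracy` and flipping it otherwise. A sketch under that assumption (`oracle_classify` is a hypothetical name, not from the original program):

```python
import random

def oracle_classify(true_labels, accuracy, rng=random):
    """Simulate a (possibly imperfect) human oracle: each true
    match/non-match label is returned correctly with probability
    `accuracy`, otherwise it is flipped."""
    labels = []
    for true in true_labels:
        if rng.random() < accuracy:
            labels.append(true)        # correct classification
        else:
            labels.append(not true)    # oracle error: flip the label
    return labels
```

At `accuracy=1.0` the simulated oracle is exact, matching the "wrongly classify 0" lines in this log.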

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)378_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 378), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)378_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)
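The occurrence distribution above (how many weight vectors occur once, twice, twenty times, ...) amounts to two nested counts: first count each distinct vector, then tally how many vectors share each count. A sketch (`occurrence_distribution` is an illustrative name):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of distinct weight vectors that
    occur that many times in the input."""
    per_vector = Counter(tuple(v) for v in weight_vectors)  # vector -> count
    return Counter(per_vector.values())                     # count -> how many
```

For example, a list with one vector appearing twice and another once yields `{2: 1, 1: 1}`.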

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
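Farthest-first selections like the one above can be sketched as a greedy loop that repeatedly picks the vector with the largest distance to its nearest already-selected vector. This is a minimal version; the seeding rule (taking the first vector) and the Euclidean metric are assumptions, as the original program may seed and measure distance differently:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: select k vectors, each time
    choosing the one that maximises the distance to its closest
    already-selected vector."""
    selected = [vectors[0]]                        # assumed seed choice
    min_dist = [math.dist(vectors[0], v) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):            # refresh nearest-selected
            d = math.dist(vectors[i], v)           # distances after new pick
            if d < min_dist[j]:
                min_dist[j] = d
    return selected
```

This spreads the sample across the weight-vector space, which is why the selected vectors above look so diverse.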

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 566 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)242_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 242), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)242_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 218 true matches and 735 true non-matches
    (22.88% true matches)
  Identified 898 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   862  (95.99%)
          2 :    33  (3.67%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 898 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 183
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 714

Removed 1 non-pure weight vector

Final number of weight vectors to use: 952
  Number of unique weight vectors: 898

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (898, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 898 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 898 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 812 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 157 matches and 655 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (157, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (655, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 655 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 655 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 3 matches and 72 non-matches
    Purity of oracle classification:  0.960
    Entropy of oracle classification: 0.242
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0
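The purity and entropy figures reported for each oracle-classified sample can be reproduced with the standard majority-class purity and binary (base-2) entropy formulas. A minimal sketch — the exact formulas used by recursive-train-selection.py are an assumption, but they reproduce the values in this log:

```python
import math

def cluster_purity_entropy(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary entropy of an
    oracle-classified set of weight vectors."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)           # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):             # -sum q*log2(q) over both classes
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# Example matching the log: 3 matches, 72 non-matches
purity, entropy = cluster_purity_entropy(3, 72)
print(round(purity, 3), round(entropy, 3))  # 0.96 0.242
```

The same function reproduces the (0.670, 0.914) figures reported for the 29-match / 59-non-match sample later in the log.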

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(20)730_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 730), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)730_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
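The occurrence distribution above (how many unique weight vectors appear once, twice, etc.) can be sketched with a double application of `collections.Counter`; the helper name and the tuple representation of a weight vector are assumptions for illustration:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight
    vectors that occur that often, as in the log's
    'Occurrence : Number of weight vectors that occur that often'."""
    counts = Counter(map(tuple, weight_vectors))  # occurrences per unique vector
    return Counter(counts.values())               # occurrence -> number of uniques

# Toy example: one vector appears twice, one three times, one once
vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
        (0.9, 0.1), (0.9, 0.1), (0.9, 0.1)]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```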

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
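The "farthest first" selections listed above follow the greedy farthest-first traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest. A generic sketch — the seed choice and the Euclidean distance are assumptions about what the program uses:

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedily select up to k vectors so each new pick maximises its
    minimum Euclidean distance to the vectors already selected."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    # min_dist[i]: distance from vectors[i] to the nearest selected vector
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[idx]))
    return selected
```

On the weight vectors above this tends to pick vectors spread across the similarity space, which is why both very high (likely match) and very low (likely non-match) vectors appear in each sample.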

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 29 matches and 59 non-matches
    Purity of oracle classification:  0.670
    Entropy of oracle classification: 0.914
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 29 matches and 59 non-matches
  Classified 162 matches and 777 non-matches
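The SVM step above trains on the oracle-labelled samples and uses the resulting classifier to split the remaining, unlabelled weight vectors of the cluster into two sub-clusters (162 predicted matches, 777 predicted non-matches here). A minimal sketch using scikit-learn; the kernel and parameters are assumptions, not the program's actual settings:

```python
from sklearn import svm

def svm_split(labelled_vecs, labels, remaining_vecs):
    """Fit an SVM on oracle-labelled weight vectors (1 = match,
    0 = non-match) and split the remaining vectors of the cluster
    into predicted-match / predicted-non-match sub-clusters."""
    clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(labelled_vecs, labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if p == 0]
    return matches, non_matches
```

Each sub-cluster then re-enters the queue with the purity, entropy, and estimated match proportion of the sample it was derived from, as the Loop 2 queue listing below shows.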

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (162, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)
    (777, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)

Current size of match and non-match training data sets: 29 / 59

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 162 weight vectors
- Estimated match proportion 0.330

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 162 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)348_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 348), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)348_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 226 true matches and 857 true non-matches
    (20.87% true matches)
  Identified 1026 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   989  (96.39%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1026 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1026

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1026, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1026 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1026 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 938 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 159 matches and 779 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (779, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 779 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 779 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 3 matches and 72 non-matches
    Purity of oracle classification:  0.960
    Entropy of oracle classification: 0.242
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)811_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 811), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)811_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
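The "far" method above is a farthest-first traversal: starting from one vector, it repeatedly selects the vector whose minimum distance to the already-selected set is largest. A dependency-free sketch (assuming Euclidean distance and a hypothetical `farthest_first` helper, not the program's actual implementation):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors: seed with the first vector, then
    repeatedly take the remaining vector that maximises the minimum
    distance to the selected set."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# 1-D toy example: the far endpoint is picked before interior points
pts = [(0.0,), (0.1,), (0.5,), (1.0,)]
chosen = farthest_first(pts, 3)
# -> [(0.0,), (1.0,), (0.5,)]
```

This is why the selected samples above spread across the extremes of the weight space rather than clustering near its centre.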

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
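The purity and entropy figures above follow from the oracle's match/non-match counts. A sketch of the (assumed) standard definitions, which reproduce the reported numbers for 24 matches and 63 non-matches:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity is the majority-class fraction; entropy is the binary
    Shannon entropy of the match proportion."""
    n = num_match + num_non_match
    p = num_match / n
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(24, 63)
# purity ~ 0.724, entropy ~ 0.850, matching the oracle output above
```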

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches
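The SVM step trains on the 87 oracle-labelled vectors and splits the 829 remaining vectors into a candidate-match and a candidate-non-match cluster, which are then pushed onto the queue. A sketch assuming scikit-learn's `SVC` (the real program may configure the SVM differently; `svm_split` is a hypothetical helper):

```python
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, remaining_vecs):
    """Train an SVM on oracle-labelled weight vectors (label 1 = match,
    0 = non-match) and split the remaining cluster by its predictions."""
    clf = SVC(kernel="linear")  # linear kernel as a simple default
    clf.fit(labelled_vecs, labels)
    pred = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, pred) if p == 0]
    return matches, non_matches

# Toy 2-D data: high similarity weights labelled match, low ones non-match
train = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
y = [1, 1, 0, 0]
m, nm = svm_split(train, y, [[0.85, 0.85], [0.15, 0.15]])
```

Note that in the log both resulting clusters inherit the parent's purity, entropy, and estimated match proportion; those statistics are only re-estimated once a cluster is itself sampled.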

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)625_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 625), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)625_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 799
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 799 weight vectors
  Containing 224 true matches and 575 true non-matches
    (28.04% true matches)
  Identified 760 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   741  (97.50%)
          2 :    16  (2.11%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 760 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 572

Removed 1 non-pure weight vector

Final number of weight vectors to use: 798
  Number of unique weight vectors: 760

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (760, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 760 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 675 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 149 matches and 526 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (526, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 149 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)940_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976744
recall                 0.421405
f-measure              0.588785
da                          129
dm                            0
ndm                           0
tp                          126
fp                            3
tn                  4.76529e+07
fn                          173
Name: (10, 1 - acm diverg, 940), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)940_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 519
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 519 weight vectors
  Containing 129 true matches and 390 true non-matches
    (24.86% true matches)
  Identified 506 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   496  (98.02%)
          2 :     7  (1.38%)
          3 :     3  (0.59%)

Identified 0 non-pure unique weight vectors (from 506 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 118
     0.000 : 388

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 519
  Number of unique weight vectors: 506

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (506, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 506 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 506 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.500, 0.679, 0.583, 0.588, 0.333] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.429, 0.571, 0.750, 0.600] (False)
    [1.000, 0.000, 0.435, 0.700, 0.600, 0.647, 0.714] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.950, 0.000, 0.619, 0.800, 0.478, 0.280, 0.625] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 30 matches and 51 non-matches
    Purity of oracle classification:  0.630
    Entropy of oracle classification: 0.951
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 425 weight vectors
  Based on 30 matches and 51 non-matches
  Classified 94 matches and 331 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.6296296296296297, 0.9509560484549725, 0.37037037037037035)
    (331, 0.6296296296296297, 0.9509560484549725, 0.37037037037037035)

Current size of match and non-match training data sets: 30 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 331 weight vectors
- Estimated match proportion 0.370

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 331 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
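Farthest-first selection samples weight vectors that are maximally spread out in the similarity space: starting from one vector, each step adds the vector whose minimum distance to the already-selected set is largest. A sketch under that standard greedy definition (the original program's choice of start vector and distance metric is not shown in this log; Euclidean distance and start index 0 are assumed):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: select k vectors so that each
    new pick maximises its distance to the closest pick so far."""
    selected = [vectors[start]]
    # Minimum distance from every vector to the selected set so far
    min_d = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], math.dist(v, vectors[i]))
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.1), (1.0, 0.0), (0.5, 0.5)]
print(farthest_first(vecs, 3))  # → [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```

This spreading behaviour explains why the selected samples above mix very different weight patterns rather than near-duplicates.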

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 2 matches and 68 non-matches
    Purity of oracle classification:  0.971
    Entropy of oracle classification: 0.187
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

129.0
Analysing file: diverg(20)925_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 925), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)925_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1092
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1092 weight vectors
  Containing 221 true matches and 871 true non-matches
    (20.24% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1000  (96.53%)
          2 :    33  (3.19%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
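A frequency distribution like the one above can be obtained by hashing each weight vector (as a tuple) and counting duplicates, then counting the counts. A sketch with `collections.Counter` (illustrative, not the original code):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map 'occurrence count' -> number of unique weight vectors
    that occur exactly that often."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

# One vector occurring twice, one once, one three times:
vectors = [[0.1, 0.2], [0.1, 0.2], [0.3, 0.4],
           [0.5, 0.6], [0.5, 0.6], [0.5, 0.6]]
print(sorted(occurrence_distribution(vectors).items()))
# → [(1, 1), (2, 1), (3, 1)]
```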

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 850

Removed 1 non-pure weight vector
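A non-pure unique weight vector is one whose identical copies carry both match and non-match labels; as the log entry above indicates, the minority-class copies are dropped so that every unique vector ends up with a single label. A hedged sketch of that clean-up step (names and tie-breaking are illustrative assumptions):

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """For each unique vector keep only the copies carrying the
    majority label, dropping minority-class copies of non-pure vectors.
    labelled_vectors: list of (tuple_vector, is_match) pairs."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        # Majority label; ties are resolved towards 'match' here
        majority = sum(labels) * 2 >= len(labels)
        kept.extend((vec, lab) for lab in labels if lab == majority)
    return kept

# A vector with pureness 0.95 (19 match copies, 1 non-match copy),
# plus a pure non-match vector occurring 3 times:
data = [((0.9, 0.8), True)] * 19 + [((0.9, 0.8), False)]
data += [((0.1, 0.2), False)] * 3
print(len(remove_minority_copies(data)))  # → 22
```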

Final number of weight vectors to use: 1091
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 845 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (845, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 845 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 845 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(10)947_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 947), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)947_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 783
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 783 weight vectors
  Containing 208 true matches and 575 true non-matches
    (26.56% true matches)
  Identified 736 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   701  (95.24%)
          2 :    32  (4.35%)
          3 :     2  (0.27%)
         12 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 736 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 554

Removed 1 non-pure weight vector

Final number of weight vectors to use: 782
  Number of unique weight vectors: 736

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (736, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 736 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 736 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 31 matches and 54 non-matches
    Purity of oracle classification:  0.635
    Entropy of oracle classification: 0.947
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 651 weight vectors
  Based on 31 matches and 54 non-matches
  Classified 324 matches and 327 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (324, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)
    (327, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)

Current size of match and non-match training data sets: 31 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.95
- Size 327 weight vectors
- Estimated match proportion 0.365

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 327 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.556, 0.348, 0.467, 0.636, 0.412] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.269, 0.478, 0.750, 0.385, 0.455] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.571, 0.857, 0.583, 0.667, 0.889] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 0 matches and 70 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0
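The purity and entropy figures reported after each oracle step follow the usual two-class cluster-quality definitions: purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the match/non-match split. A minimal sketch (the function name is illustrative, not from the original script):

```python
import math

def cluster_quality(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = Shannon
    entropy (base 2) of the match / non-match proportions."""
    total = num_matches + num_non_matches
    purity = max(num_matches, num_non_matches) / total
    entropy = 0.0
    for count in (num_matches, num_non_matches):
        p = count / total
        if p > 0.0:
            entropy -= p * math.log(p, 2)
    return purity, entropy

# The all-non-match oracle result above: 0 matches, 70 non-matches
print(cluster_quality(0, 70))  # -> (1.0, 0.0)
```

With 27 matches and 56 non-matches this reproduces the 0.675 / 0.910 pair reported later in the log.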

*** Warning: Oracle returns an empty match dictionary ***
Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(20)481_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 481), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)481_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 667
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 667 weight vectors
  Containing 217 true matches and 450 true non-matches
    (32.53% true matches)
  Identified 630 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   612  (97.14%)
          2 :    15  (2.38%)
          3 :     2  (0.32%)
         19 :     1  (0.16%)
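The occurrence histogram above can be built with two counting passes: first count how often each distinct weight vector occurs, then count how many vectors share each frequency. A sketch, assuming weight vectors are stored as sequences of floats:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count occurrences of each distinct weight vector, then count
    how many distinct vectors occur with each frequency."""
    vec_counts = Counter(map(tuple, weight_vectors))
    freq_dist = Counter(vec_counts.values())
    return dict(sorted(freq_dist.items()))

vecs = [[0.5, 1.0], [0.5, 1.0], [0.2, 0.3], [0.9, 0.1]]
print(occurrence_distribution(vecs))  # -> {1: 2, 2: 1}
```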

Identified 1 non-pure unique weight vector (from 630 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 447

Removed 1 non-pure weight vector
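The removal step drops, for each unique weight vector value that carries both class labels, the copies belonging to the minority class (e.g. the pureness-0.947 vector above: 18 match copies kept, 1 non-match copy removed). A hedged sketch of this filtering, with an illustrative helper name:

```python
from collections import defaultdict

def remove_minority_class(labelled_vectors):
    """labelled_vectors: list of (weight_vector_tuple, is_match).
    For any vector value seen with both labels, keep only the copies
    carrying the majority label; pure vectors are kept unchanged."""
    by_vec = defaultdict(lambda: [0, 0])  # vec -> [non-match count, match count]
    for vec, is_match in labelled_vectors:
        by_vec[vec][int(is_match)] += 1
    kept = []
    for vec, is_match in labelled_vectors:
        nm, m = by_vec[vec]
        if nm == 0 or m == 0:
            kept.append((vec, is_match))          # pure vector value
        elif int(is_match) == (1 if m >= nm else 0):
            kept.append((vec, is_match))          # majority label only
    return kept

# 19 copies of one vector value (18 matches, 1 non-match) plus a pure one
data = [((0.9, 0.8), True)] * 18 + [((0.9, 0.8), False), ((0.1, 0.2), False)]
print(len(remove_minority_class(data)))  # -> 19
```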

Final number of weight vectors to use: 666
  Number of unique weight vectors: 630

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (630, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 630 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 630 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
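"Farthest first" selection greedily picks, at each step, the weight vector whose minimum distance to the already-selected set is largest, spreading the sample across the weight-vector space. A minimal sketch assuming Euclidean distance and seeding from the first vector (the original script may seed and measure distance differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy max-min selection of k vectors: start from the first
    vector, then repeatedly add the vector farthest from all chosen."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        # Update each vector's distance to its nearest selected vector
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[idx]))
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # -> [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```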

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 547 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 135 matches and 412 non-matches
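After the oracle labels the sampled vectors, the rest of the cluster is split by a classifier trained on those labels. A sketch of an SVM-based split using scikit-learn's `SVC` (an assumption about the concrete implementation; the log only states that an SVM split classifier is used):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on the oracle-classified sample and split the
    unclassified remainder into predicted matches / non-matches."""
    clf = SVC()  # default RBF kernel; a sketch only
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, pred) if p == 0]
    return matches, non_matches

# Toy example: high similarities -> match (1), low -> non-match (0)
train = [[0.9, 0.9], [0.95, 1.0], [0.1, 0.2], [0.0, 0.1]]
labels = [1, 1, 0, 0]
m, n = svm_split(train, labels, [[1.0, 0.95], [0.05, 0.0]])
```

The two resulting subsets become new clusters on the queue, which is why Loop 2 above shows queue entries of sizes 135 and 412.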

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (412, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 412 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 412 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 12 matches and 58 non-matches
    Purity of oracle classification:  0.829
    Entropy of oracle classification: 0.661
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(20)233_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 233), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)233_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)930_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 930), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)930_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
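The "Farthest first selection" step above greedily picks, at each round, the unselected vector whose minimum distance to the already-selected set is largest. A minimal sketch (assuming Euclidean distance and the first vector as seed — the program's own seeding and metric may differ):

```python
import numpy as np

def farthest_first(vectors, k, seed_idx=0):
    """Greedily select k vectors; each pick maximises the distance
    to the closest already-selected vector (max-min criterion)."""
    X = np.asarray(vectors, dtype=float)
    selected = [seed_idx]
    # distance from every vector to its nearest selected vector
    min_dist = np.linalg.norm(X - X[seed_idx], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))   # farthest from the current selection
        selected.append(nxt)
        np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1), out=min_dist)
    return selected
```

This max-min criterion is why the selected vectors above are spread across the corners of the weight-vector space rather than clustered together.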

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
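The reported purity and entropy follow directly from the oracle's class counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy (in bits) of the match proportion. A sketch of the calculation (the function name is illustrative):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class.
    Entropy: binary Shannon entropy (bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total                    # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For 24 matches and 63 non-matches this gives purity ≈ 0.724 and entropy ≈ 0.850, matching the values above.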

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing the file: diverg(10)692_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (10, 1 - acm diverg, 692), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)692_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 472
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 472 weight vectors
  Containing 223 true matches and 249 true non-matches
    (47.25% true matches)
  Identified 436 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   420  (96.33%)
          2 :    13  (2.98%)
          3 :     2  (0.46%)
         20 :     1  (0.23%)
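The frequency distribution above counts how many distinct weight vectors occur once, twice, and so on. It can be reproduced with `collections.Counter` (the helper name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that occur exactly that often."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())   # occurrence -> number of vectors
```

Here, 420 of the 436 unique vectors occur once, and a single vector occurs 20 times.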

Identified 1 non-pure unique weight vector (from 436 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 248

Removed 1 non-pure weight vector
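Non-pure unique weight vectors — identical feature vectors that carry both match and non-match labels — are made pure by dropping their minority-class copies. A sketch of this clean-up (the helper name is illustrative, and resolving ties toward the match class is an assumption):

```python
from collections import defaultdict

def drop_minority_copies(weight_vectors, labels):
    """Group identical weight vectors; in groups that mix match and
    non-match labels, drop the copies carrying the minority label."""
    groups = defaultdict(list)
    for i, vec in enumerate(weight_vectors):
        groups[tuple(vec)].append(i)
    kept = []
    for idxs in groups.values():
        n_match = sum(1 for i in idxs if labels[i])
        if 0 < n_match < len(idxs):              # non-pure group
            majority = n_match * 2 >= len(idxs)  # ties go to matches (assumed)
            kept.extend(i for i in idxs if labels[i] == majority)
        else:
            kept.extend(idxs)
    return sorted(kept)
```

In the log above, the vector with pureness 0.950 occurs 20 times (19 matches, 1 non-match), so its single non-match copy is removed, leaving 471 vectors.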

Final number of weight vectors to use: 471
  Number of unique weight vectors: 436

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (436, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 436 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 436 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 34 matches and 45 non-matches
    Purity of oracle classification:  0.570
    Entropy of oracle classification: 0.986
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 357 weight vectors
  Based on 34 matches and 45 non-matches
  Classified 148 matches and 209 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.569620253164557, 0.9859690274511927, 0.43037974683544306)
    (209, 0.569620253164557, 0.9859690274511927, 0.43037974683544306)

Current size of match and non-match training data sets: 34 / 45

Selected cluster (queue ordering: random) with:
- Purity 0.57 and entropy 0.99
- Size 148 weight vectors
- Estimated match proportion 0.430

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 51 matches and 7 non-matches
    Purity of oracle classification:  0.879
    Entropy of oracle classification: 0.531
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing the file: diverg(15)279_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 279), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)279_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 913
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 913 weight vectors
  Containing 204 true matches and 709 true non-matches
    (22.34% true matches)
  Identified 862 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   828  (96.06%)
          2 :    31  (3.60%)
          3 :     2  (0.23%)
         17 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 862 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 688

Removed 1 non-pure weight vector

Final number of weight vectors to use: 912
  Number of unique weight vectors: 862

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (862, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 862 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 862 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 776 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 146 matches and 630 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (630, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 146 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

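Farthest-first selection picks, at each step, the weight vector whose nearest already-selected vector is farthest away, so the sample spreads across the whole cluster. A compact sketch, assuming Euclidean distance and the first vector as seed (the script's actual seeding may differ):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly add the vector that maximises the distance to its closest
    already-selected neighbour."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while remaining and len(selected) < k:
        # for each candidate, distance to its closest selected vector
        best = max(remaining, key=lambda v: min(dist2(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

This is why the selected lists above mix very high and very low similarity vectors: extremes of the cluster are reached first.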
Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 48 matches and 6 non-matches
    Purity of oracle classification:  0.889
    Entropy of oracle classification: 0.503
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

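The oracle is simulated with a configurable accuracy (the `oracle_acc` command-line parameter); at 100% it simply returns the true match status, while at lower accuracies each label would be flipped with probability `1 - accuracy`. A hedged sketch of such a simulation (names are illustrative):

```python
import random

def simulate_oracle(sample, accuracy, rng=None):
    """Simulate a human oracle: return the true match status for each
    (vector, true_match) pair, flipping it with probability 1 - accuracy."""
    rng = rng or random.Random(42)  # fixed seed for reproducibility
    labels = []
    for vec, true_match in sample:
        correct = rng.random() < accuracy
        labels.append(true_match if correct else not true_match)
    return labels
```

With `accuracy=1.0` every label is returned unchanged, which matches the "wrongly classify 0" lines throughout this log.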
Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(15)920_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (15, 1 - acm diverg, 920), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)920_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 962
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 962 weight vectors
  Containing 212 true matches and 750 true non-matches
    (22.04% true matches)
  Identified 909 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   874  (96.15%)
          2 :    32  (3.52%)
          3 :     2  (0.22%)
         18 :     1  (0.11%)

Identified 1 non-pure unique weight vectors (from 909 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 729

Removed 1 non-pure weight vectors

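The pureness analysis above groups identical weight vectors and checks whether all their copies share the same true match status; for each non-pure unique vector, the minority-class copies are removed (here the 18-copy vector with pureness 0.944 loses its single non-match copy, 962 → 961). A sketch of this filtering, with illustrative names:

```python
from collections import defaultdict

def pureness_filter(weight_vectors):
    """Group identical weight vectors, compute pureness (fraction of
    copies that are true matches), and drop minority-class copies of any
    non-pure vector. Ties (pureness 0.5) are resolved towards matches,
    an assumption not confirmed by the log."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[tuple(vec)].append(is_match)

    kept, removed = [], 0
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        majority = pureness >= 0.5
        for is_match in labels:
            if pureness in (0.0, 1.0) or is_match == majority:
                kept.append((list(vec), is_match))
            else:
                removed += 1
    return kept, removed
```

Removing these contradictory copies guarantees every remaining unique weight vector has a single true match status, which the oracle simulation relies on.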
Final number of weight vectors to use: 961
  Number of unique weight vectors: 909

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (909, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 909 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 909 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 822 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 117 matches and 705 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (117, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (705, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 117 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 117 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing file: diverg(10)518_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.977444
recall                 0.434783
f-measure              0.601852
da                          133
dm                            0
ndm                           0
tp                          130
fp                            3
tn                  4.76529e+07
fn                          169
Name: (10, 1 - acm diverg, 518), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)518_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 496
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 496 weight vectors
  Containing 121 true matches and 375 true non-matches
    (24.40% true matches)
  Identified 484 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   475  (98.14%)
          2 :     6  (1.24%)
          3 :     3  (0.62%)

Identified 0 non-pure unique weight vectors (from 484 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 111
     0.000 : 373

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 496
  Number of unique weight vectors: 484

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (484, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 484 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 484 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.500, 0.679, 0.583, 0.588, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.429, 0.571, 0.750, 0.600] (False)
    [1.000, 0.000, 0.435, 0.700, 0.600, 0.647, 0.714] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.950, 0.000, 0.619, 0.800, 0.478, 0.280, 0.625] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 29 matches and 51 non-matches
    Purity of oracle classification:  0.637
    Entropy of oracle classification: 0.945
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 404 weight vectors
  Based on 29 matches and 51 non-matches
  Classified 90 matches and 314 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (90, 0.6375, 0.944738828646789, 0.3625)
    (314, 0.6375, 0.944738828646789, 0.3625)

Current size of match and non-match training data sets: 29 / 51

Selected cluster (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 90 weight vectors
- Estimated match proportion 0.362

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 90 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 37 matches and 8 non-matches
    Purity of oracle classification:  0.822
    Entropy of oracle classification: 0.675
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

133.0
Analysing file: diverg(15)490_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (15, 1 - acm diverg, 490), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)490_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 581
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 581 weight vectors
  Containing 187 true matches and 394 true non-matches
    (32.19% true matches)
  Identified 559 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   543  (97.14%)
          2 :    13  (2.33%)
          3 :     2  (0.36%)
          6 :     1  (0.18%)

Identified 0 non-pure unique weight vectors (from 559 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.000 : 392

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 581
  Number of unique weight vectors: 559

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (559, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 559 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 559 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
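
The farthest-first traversal used for these selections can be sketched as below. This is a minimal version: Euclidean distance, a seeded random first pick, and the function name are assumptions, since the script's own implementation is not shown in this output.

```python
import math
import random

def euclidean(u, v):
    """Euclidean distance between two weight vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k, seed=42):
    """Greedily select k vectors: start from a random one, then
    repeatedly add the vector whose minimum distance to the vectors
    already selected is largest."""
    rng = random.Random(seed)
    selected = [rng.choice(vectors)]
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
    return selected

vecs = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.1], [0.9, 0.0], [0.5, 0.5]]
picked = farthest_first(vecs, 3)
```

Because each new pick maximises the minimum distance to all previous picks, the selection spreads across the corners of the weight-vector space, which is why the listing above mixes clear matches and clear non-matches.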

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 29 matches and 53 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.937
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0
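
The purity and entropy figures reported for each oracle classification follow directly from the match/non-match counts: purity is the majority-class fraction and entropy is the binary Shannon entropy (in bits) of the two class proportions. A small sketch reproducing the 0.646 / 0.937 values from the 29 matches and 53 non-matches above:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy (in bits) of the match/non-match proportions."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(29, 53)
print(round(purity, 3), round(entropy, 3))   # 0.646 0.937
```

The same formula reproduces the later oracle blocks, e.g. 47 matches / 7 non-matches gives purity 0.870 and entropy 0.556.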

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 477 weight vectors
  Based on 29 matches and 53 non-matches
  Classified 141 matches and 336 non-matches
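
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by predicted class. A minimal sketch using scikit-learn's `SVC` as a stand-in (the script's actual SVM implementation and parameters are not shown in this output):

```python
from sklearn.svm import SVC

def split_cluster(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on oracle-labelled weight vectors, then split the
    rest of the cluster into predicted-match / predicted-non-match
    sub-clusters."""
    clf = SVC(kernel="linear")
    clf.fit(labelled_vecs, labels)
    preds = clf.predict(unlabelled_vecs)
    match_cluster = [v for v, p in zip(unlabelled_vecs, preds) if p == 1]
    non_match_cluster = [v for v, p in zip(unlabelled_vecs, preds) if p == 0]
    return match_cluster, non_match_cluster

train_x = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]]  # oracle-labelled
train_y = [1, 1, 0, 0]                                      # 1 = match
rest = [[0.85, 0.95], [0.15, 0.15]]                         # unlabelled remainder
matches, non_matches = split_cluster(train_x, train_y, rest)
```

Both resulting sub-clusters are pushed back onto the queue with the purity and entropy estimated from the oracle sample, which is why the two queue entries in Loop 2 below carry identical purity, entropy, and match-proportion estimates.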

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)
    (336, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)

Current size of match and non-match training data sets: 29 / 53

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 141 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 47 matches and 7 non-matches
    Purity of oracle classification:  0.870
    Entropy of oracle classification: 0.556
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0
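
The simulated oracle can be sketched as below: with accuracy 1.0 (the 100.00% setting used in these runs) every true match status is returned unchanged, while a lower accuracy flips each label with the complementary probability. The function name and interface are assumptions; the script's actual implementation is not shown here.

```python
import random

def oracle_classify(true_statuses, accuracy, seed=0):
    """Return the oracle's labels: each true match status is kept
    with probability `accuracy` and flipped otherwise."""
    rng = random.Random(seed)
    labels = []
    for status in true_statuses:
        if rng.random() < accuracy:
            labels.append(status)       # correct classification
        else:
            labels.append(not status)   # oracle error
    return labels

truth = [True, False, False, True, False]
perfect = oracle_classify(truth, accuracy=1.0)   # identical to truth
```

With accuracy 1.0 the false-match and false-non-match counts are necessarily zero, matching the statistics printed above.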

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing the file: diverg(20)850_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 850), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)850_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
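
The uniqueness analysis above (number of distinct weight vectors plus the occurrence frequency distribution) amounts to a nested frequency count. A minimal sketch, assuming weight vectors are given as sequences of floats:

```python
from collections import Counter

def analyse_vectors(vectors):
    """Count occurrences of each unique weight vector and build the
    'occurrence : number of vectors occurring that often' table."""
    vec_counts = Counter(map(tuple, vectors))       # vector -> count
    freq_dist = Counter(vec_counts.values())        # count -> how many vectors
    return len(vec_counts), dict(sorted(freq_dist.items()))

vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.9), (0.2, 0.3), (0.2, 0.3)]
num_unique, dist = analyse_vectors(vecs)
print(num_unique, dist)   # 3 {1: 1, 2: 1, 3: 1}
```

Applied to the 1076 vectors above this yields the 1019 unique vectors and the 982 / 34 / 2 / 1 distribution reported.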

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 148 matches and 784 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (784, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 784 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 784 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 8 matches and 66 non-matches
    Purity of oracle classification:  0.892
    Entropy of oracle classification: 0.494
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)1000_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 1000), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)1000_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
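
Farthest first selection greedily picks, at each step, the vector whose minimum distance to the already-selected set is largest. A minimal 2-D sketch (Euclidean distance and first-vector seeding are assumptions; the script's actual metric and seeding are not shown here):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum distance to the already-selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed with the first vector (an assumption)
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

Note how the near-duplicate point `(0.1, 0.0)` is skipped: farthest-first favours spread-out samples, which is why the selected weight vectors above cover both clear matches and clear non-matches.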

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and misclassify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
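
The oracle here runs at 100.00% accuracy, but the `oracle_acc` parameter in the usage notes suggests imperfect oracles are also simulated. A sketch of how such a noisy oracle might be modelled (the function and flip rule are assumptions, not the script's code):

```python
import random

def noisy_oracle(true_status, accuracy, rng=random.Random(42)):
    """Return the true match status with probability `accuracy`,
    otherwise flip it (a sketch of a simulated imperfect oracle)."""
    if rng.random() < accuracy:
        return true_status
    return not true_status

# With accuracy 1.0 the oracle never errs, matching the run above
labels = [noisy_oracle(s, 1.0) for s in [True, False, False]]
print(labels)  # [True, False, False]
```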

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)446_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 446), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)446_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1100
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1100 weight vectors
  Containing 227 true matches and 873 true non-matches
    (20.64% true matches)
  Identified 1043 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1006  (96.45%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1043 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1099
  Number of unique weight vectors: 1043
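
The frequency distribution and pureness figures above come from counting duplicate weight vectors and the match status of each copy. A minimal sketch with toy data (the vectors below are invented):

```python
from collections import Counter

# Hypothetical (weight vector, true match status) rows
rows = [((1.0, 0.9), 1.0), ((1.0, 0.9), 1.0), ((0.1, 0.0), 0.0),
        ((0.5, 0.5), 1.0), ((0.5, 0.5), 0.0), ((0.5, 0.5), 0.0)]

# Frequency distribution: how often does each unique vector occur?
occ = Counter(vec for vec, _ in rows)
freq = Counter(occ.values())
print(dict(freq))  # one vector occurs once, one twice, one three times

# Pureness per unique vector: fraction of its copies that are matches;
# a vector like (0.5, 0.5) with pureness 1/3 is non-pure, and its
# minority-class copies would be removed
matches = Counter(vec for vec, status in rows if status == 1.0)
for vec, count in occ.items():
    print(vec, round(matches[vec] / count, 3))
```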

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1043, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1043 weight vectors
- Estimated match proportion 0.500
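
Clusters wait in a queue as (size, purity, entropy, estimated match proportion) tuples, and with random queue ordering the next cluster to refine is drawn at random. A minimal sketch (the fixed seed and `pop` logic are assumptions):

```python
import random

# Queue entries mirror the tuples printed in the loop headers:
# (size, purity, entropy, estimated match proportion)
queue = [(112, 0.7356, 0.8333, 0.2644),
         (820, 0.7356, 0.8333, 0.2644)]

rng = random.Random(1)  # fixed seed so the example is reproducible
cluster = queue.pop(rng.randrange(len(queue)))
size, cl_purity, cl_entropy, est_match_prop = cluster
print(size, est_match_prop)
```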

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1043 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and misclassify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 955 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 156 matches and 799 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (799, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 156 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 156 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and misclassify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)248_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 248), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)248_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and misclassify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and misclassify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
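The purity and entropy figures reported for an oracle-classified sample can be reproduced from the match / non-match counts alone: purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the split. A minimal sketch (function names are illustrative, not from the original script):

```python
import math

def cluster_purity(num_matches, num_non_matches):
    """Fraction of the majority class in the classified sample."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    """Shannon entropy (base 2) of the match / non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# The 14-match / 54-non-match split reported above:
print(round(cluster_purity(14, 54), 3))   # 0.794
print(round(cluster_entropy(14, 54), 3))  # 0.734
```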

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)396_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 396), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)396_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808
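The two analysis steps above (the occurrence frequency distribution and the detection of non-pure unique weight vectors) can be sketched as follows. This is a minimal illustration with made-up data, not the original script's code; all names are assumptions.

```python
# Hedged sketch: count how often each unique weight vector occurs, and
# flag vectors whose copies carry mixed (non-pure) match labels.
from collections import Counter, defaultdict

# (weight_tuple, true_match_status) pairs, as loaded from a weight vector file
weight_vectors = [
    ((0.5, 1.0), True), ((0.5, 1.0), True),
    ((0.2, 0.0), False), ((0.9, 1.0), True), ((0.9, 1.0), False),
]

# Frequency distribution: occurrence count -> number of unique vectors
occ = Counter(vec for vec, _ in weight_vectors)
freq_dist = Counter(occ.values())

# Pureness: fraction of matches among the copies of each unique vector
labels = defaultdict(list)
for vec, is_match in weight_vectors:
    labels[vec].append(is_match)
pureness = {vec: sum(l) / len(l) for vec, l in labels.items()}

# Vectors with mixed labels; their minority-class copies get removed
non_pure = [vec for vec, p in pureness.items() if 0.0 < p < 1.0]
```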

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
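Farthest-first selection, as used above, starts from one vector and then repeatedly picks the remaining vector whose minimum distance to the already-selected set is largest. A pure-Python sketch; the original script's exact seeding strategy and distance metric are assumptions here:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(vectors, k, start_index=0):
    """Greedily select k vectors that are maximally spread out."""
    selected = [vectors[start_index]]
    remaining = [v for i, v in enumerate(vectors) if i != start_index]
    while len(selected) < k and remaining:
        # Pick the remaining vector farthest from its nearest selected one
        best = max(remaining,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # spreads toward the far corners first
```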

Perform oracle with 100.00 accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
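An oracle with a configurable accuracy, as logged above, can be simulated by flipping the true label for a random fraction of the queried vectors; with accuracy 1.0 every label is returned unchanged. The function name and flipping scheme below are assumptions, not the original code:

```python
import random

def oracle_classify(true_labels, accuracy, seed=42):
    """Return labels, with (1 - accuracy) of them randomly flipped."""
    rng = random.Random(seed)
    num_wrong = int(round((1.0 - accuracy) * len(true_labels)))
    wrong_idx = set(rng.sample(range(len(true_labels)), num_wrong))
    return [not lab if i in wrong_idx else lab
            for i, lab in enumerate(true_labels)]

labels = [True] * 26 + [False] * 61   # an 87-vector sample, as above
answers = oracle_classify(labels, accuracy=1.0)
print(sum(answers), len(answers) - sum(answers))  # 26 61
```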

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches
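The splitting step above trains an SVM on the oracle-labelled sample and uses it to divide the rest of the cluster into candidate match and non-match sub-clusters. A minimal sketch assuming scikit-learn; the original kernel and parameters are unknown:

```python
from sklearn.svm import SVC

def split_cluster(train_vecs, train_labels, cluster_vecs):
    """Split cluster_vecs into (matches, non_matches) via a trained SVM."""
    clf = SVC(kernel="linear")   # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if not p]
    return matches, non_matches

# Tiny illustration: high similarity weights -> match
train = [[0.9, 0.9], [0.95, 0.8], [0.1, 0.2], [0.2, 0.1]]
y = [True, True, False, False]
m, n = split_cluster(train, y, [[0.85, 0.9], [0.15, 0.1]])
```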

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 118 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 118 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00 accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(20)594_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 594), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)594_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1027
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1027 weight vectors
  Containing 223 true matches and 804 true non-matches
    (21.71% true matches)
  Identified 973 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   936  (96.20%)
          2 :    34  (3.49%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 973 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 783

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 973

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (973, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 973 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 973 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00 accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 886 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 131 matches and 755 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (755, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 755 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 755 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)

Perform oracle with 100.00 accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 11 matches and 62 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(10)160_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (10, 1 - acm diverg, 160), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)160_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 375
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 375 weight vectors
  Containing 196 true matches and 179 true non-matches
    (52.27% true matches)
  Identified 346 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   333  (96.24%)
          2 :    10  (2.89%)
          3 :     2  (0.58%)
         16 :     1  (0.29%)

Identified 1 non-pure unique weight vector (from 346 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 178

Removed 1 non-pure weight vector

Final number of weight vectors to use: 374
  Number of unique weight vectors: 346

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (346, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 346 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 346 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 41 matches and 34 non-matches
    Purity of oracle classification:  0.547
    Entropy of oracle classification: 0.994
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  34
    Number of false non-matches: 0
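The purity and entropy values reported above follow the standard definitions for a two-class cluster: purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (this reproduces the 0.547 / 0.994 figures for 41 matches and 34 non-matches, so it appears to match what the script computes, though the original code is not shown in this log):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Majority-class purity and binary Shannon entropy of a cluster."""
    total = num_match + num_non_match
    p = num_match / total          # match proportion
    purity = max(p, 1.0 - p)       # fraction in the larger class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

print(purity_entropy(41, 34))  # roughly (0.547, 0.994), as in the log above
```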

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 271 weight vectors
  Based on 41 matches and 34 non-matches
  Classified 130 matches and 141 non-matches
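The split step trains a classifier on the oracle-labelled vectors and uses its predictions to divide the remaining cluster in two; each predicted class becomes a new cluster on the queue, which is why the queue length grows to 2 in the next loop. A rough sketch with scikit-learn's `svm.SVC` (the original script's exact SVM settings are not visible in this log, so the linear kernel here is an assumption):

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining cluster into predicted matches and non-matches."""
    clf = svm.SVC(kernel='linear')  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```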

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.5466666666666666, 0.993707106604508, 0.5466666666666666)
    (141, 0.5466666666666666, 0.993707106604508, 0.5466666666666666)

Current size of match and non-match training data sets: 41 / 34

Selected cluster with (queue ordering: random):
- Purity 0.55 and entropy 0.99
- Size 130 weight vectors
- Estimated match proportion 0.547

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 130 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
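Farthest-first selection greedily picks the weight vector whose minimum Euclidean distance to the already-selected set is largest, so the sample spreads across the whole cluster rather than concentrating in one region. A minimal sketch (seeding from the first vector; the original's seeding strategy is not visible in this log):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector that is
    farthest (by minimum Euclidean distance) from the selected set."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]  # seeding choice is an assumption
    while len(selected) < min(k, len(vectors)):
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```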

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 49 matches and 6 non-matches
    Purity of oracle classification:  0.891
    Entropy of oracle classification: 0.497
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0
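The oracle lines above simulate a human labeller with a configurable accuracy (the `oracle_acc` parameter from the usage message): each queried weight vector's true label is returned, or flipped with probability 1 - accuracy. A hypothetical sketch of that behaviour:

```python
import random

def oracle_classify(true_labels, accuracy, seed=42):
    """Return the true labels, each flipped with probability 1 - accuracy."""
    rng = random.Random(seed)
    return [lbl if rng.random() < accuracy else (not lbl)
            for lbl in true_labels]
```

With accuracy 1.0, as in this run, no labels are flipped, so the counts of false matches and false non-matches stay at zero.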

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing file: diverg(20)905_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 905), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)905_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec
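The analysis step above groups identical weight vectors, tallies how often each occurs, and computes each unique vector's pureness (the fraction of its occurrences that are true matches); minority-class copies of non-pure vectors are then removed. A sketch under the assumption that each weight vector is represented as a tuple of weights plus a match flag:

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(weight_vectors):
    """weight_vectors: list of (weights_tuple, is_match) pairs.
    Returns the occurrence-frequency distribution and per-vector pureness."""
    groups = defaultdict(list)
    for weights, is_match in weight_vectors:
        groups[weights].append(is_match)
    # How many unique vectors occur once, twice, three times, ...
    freq_dist = Counter(len(flags) for flags in groups.values())
    # Fraction of each unique vector's occurrences that are true matches
    pureness = {w: sum(flags) / len(flags) for w, flags in groups.items()}
    return freq_dist, pureness
```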

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)382_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 382), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)382_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1075
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1075 weight vectors
  Containing 227 true matches and 848 true non-matches
    (21.12% true matches)
  Identified 1018 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   981  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1018 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1074
  Number of unique weight vectors: 1018

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1018, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1018 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1018 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 931 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 819 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (819, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-matches
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
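The purity and entropy figures reported after each oracle run follow directly from the match / non-match counts: purity is the fraction of the majority class, entropy the binary Shannon entropy of the split. A minimal sketch (function names are ours, not from the script):

```python
import math

def cluster_purity(num_matches, num_non_matches):
    """Purity = fraction of the majority class in the classified sample."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    """Binary Shannon entropy (in bits) of the match / non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Reproduces the figures above (44 matches, 1 non-match):
print(round(cluster_purity(44, 1), 3))   # 0.978
print(round(cluster_entropy(44, 1), 3))  # 0.154
```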

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)444_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 444), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)444_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027
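The occurrence distribution and the pureness filter above can be reproduced with two `Counter` passes over the weight vectors. A sketch, under the assumption (consistent with the log) that "removing minority class weight vectors" means dropping the less frequent label among duplicate copies of the same vector:

```python
from collections import Counter

def analyse_and_filter(weight_vectors):
    """weight_vectors: list of (tuple_of_weights, is_match) pairs.

    Returns (freq_dist, kept): freq_dist maps an occurrence count to the
    number of unique vectors occurring that often; kept is the input with
    minority-class copies of non-pure vectors removed.
    """
    occur = Counter(vec for vec, _ in weight_vectors)
    freq_dist = Counter(occur.values())

    # Pureness of a unique vector = fraction of its copies that are matches
    matches = Counter(vec for vec, is_match in weight_vectors if is_match)
    kept = []
    for vec, is_match in weight_vectors:
        pureness = matches[vec] / occur[vec]
        majority_is_match = pureness >= 0.5
        if pureness in (0.0, 1.0) or is_match == majority_is_match:
            kept.append((vec, is_match))
    return freq_dist, kept
```

For example, a vector occurring 20 times with 19 match and 1 non-match copies (pureness 0.950, as above) loses exactly its single minority copy.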

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88
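The exact sample-size formula is not printed in the log, but every reported size (88 of 1027 at p = 0.5, 58 of 179 at p ≈ 0.341, and so on) is consistent with Cochran's formula with finite-population correction, z = 1.96 and a 0.1 margin, using the cluster's estimated match proportion as p. A reconstruction, offered as an inference rather than the script's confirmed code:

```python
def sample_size(cluster_size, est_match_prop, z=1.96, margin=0.1):
    """Cochran's sample size with finite-population correction.

    Assumed parameters (z=1.96, margin=0.1) reproduce every sample
    size reported in this log; they are not stated in the output.
    """
    p = est_match_prop
    n0 = z * z * p * (1.0 - p) / (margin * margin)
    return int(round(n0 / (1.0 + (n0 - 1.0) / cluster_size)))

print(sample_size(1027, 0.5))     # 88 (Loop 1 above)
print(sample_size(179, 30 / 88))  # 58 (Loop 2)
```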

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
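The farthest-first selection shown above can be sketched as the classic Gonzalez traversal: start from one vector and repeatedly add the vector whose distance to its nearest already-selected vector is largest. A minimal version using Euclidean distance (the script's actual seeding and metric are not visible in the log):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors; each new pick maximises the distance to its
    nearest already-selected vector (Gonzalez heuristic)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # arbitrary (here: first) starting point
    # min_d[i] = distance from vectors[i] to its nearest selected vector
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        min_d = [min(d, dist(v, vectors[i])) for v, d in zip(vectors, min_d)]
    return selected
```

Each round costs one distance pass over the remaining vectors, so selecting k of n vectors is O(k·n) distance computations.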

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 179 matches and 760 non-matches
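After the oracle labels a sample, the remaining vectors of an impure cluster are split by an SVM trained on those labels (here: trained on 30 matches and 58 non-matches, splitting 939 vectors into sub-clusters of 179 and 760). A minimal stand-in using a Pegasos-style linear SVM trained by sub-gradient descent, named plainly as a substitute since the log does not reveal which SVM implementation the script uses:

```python
def train_linear_svm(X, y, lam=0.01, epochs=200):
    """Pegasos-style sub-gradient training of a linear SVM.
    X: list of feature tuples; y: labels in {+1, -1} (+1 = match)."""
    d = len(X[0])
    w = [0.0] * (d + 1)  # last entry acts as bias via an appended 1.0
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)
            x = list(xi) + [1.0]
            margin = yi * sum(wj * xj for wj, xj in zip(w, x))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularisation step
            if margin < 1.0:                          # hinge-loss step
                w = [wj + eta * yi * xj for wj, xj in zip(w, x)]
    return w

def svm_split(w, vectors):
    """Split unlabelled vectors into (predicted matches, predicted non-matches)."""
    def score(x):
        return sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return ([v for v in vectors if score(v) >= 0.0],
            [v for v in vectors if score(v) < 0.0])
```

The two predicted sub-clusters are then pushed back onto the queue, which is why Loop 2 below starts with queue length 2.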

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (760, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 179 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 179 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 43 matches and 15 non-matches
    Purity of oracle classification:  0.741
    Entropy of oracle classification: 0.825
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  15
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)869_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 869), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)869_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 141 matches and 543 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 141 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)468_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                 0.976
recall                 0.408027
f-measure              0.575472
da                          125
dm                            0
ndm                           0
tp                          122
fp                            3
tn                  4.76529e+07
fn                          177
Name: (15, 1 - acm diverg, 468), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)468_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 408
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 408 weight vectors
  Containing 142 true matches and 266 true non-matches
    (34.80% true matches)
  Identified 392 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   381  (97.19%)
          2 :     8  (2.04%)
          3 :     2  (0.51%)
          5 :     1  (0.26%)

Identified 0 non-pure unique weight vectors (from 392 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 128
     0.000 : 264

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 408
  Number of unique weight vectors: 392

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (392, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 392 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 392 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 32 matches and 45 non-matches
    Purity of oracle classification:  0.584
    Entropy of oracle classification: 0.979
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

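The purity and entropy figures reported for each oracle classification are the majority-class fraction and the binary Shannon entropy of the match/non-match split. A minimal sketch that reproduces the numbers above (the function name `cluster_stats` is mine, not from the script):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity and entropy of a classified sample.

    Purity is the fraction belonging to the majority class; entropy is
    the base-2 Shannon entropy of the match / non-match distribution.
    """
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# Reproduce the oracle statistics above: 32 matches, 45 non-matches
purity, entropy = cluster_stats(32, 45)
print(round(purity, 3), round(entropy, 3))   # 0.584 0.979
```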
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 315 weight vectors
  Based on 32 matches and 45 non-matches
  Classified 89 matches and 226 non-matches

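The split step trains an SVM on the oracle-labelled sample and uses its predictions to divide the remaining weight vectors into two child clusters (here 89 predicted matches and 226 predicted non-matches). A sketch of the split itself, with the classifier abstracted as a `predict` callable since the concrete SVM wrapper used by the script is not shown in this output:

```python
def split_cluster(remaining_vectors, predict):
    """Split a cluster's unclassified weight vectors into two child
    clusters using a classifier trained on the oracle-labelled sample.

    `predict` maps a weight vector to True (match) or False (non-match);
    in the original program it would be an SVM fitted on the sampled,
    oracle-classified vectors.
    """
    match_child, non_match_child = [], []
    for vec in remaining_vectors:
        (match_child if predict(vec) else non_match_child).append(vec)
    return match_child, non_match_child

# Toy stand-in classifier: call a vector a match if its mean weight is
# above 0.5 (the real program uses a trained SVM instead).
vectors = [[0.9, 0.8, 0.7], [0.1, 0.2, 0.3], [0.6, 0.9, 0.8]]
matches, non_matches = split_cluster(vectors, lambda v: sum(v) / len(v) > 0.5)
print(len(matches), len(non_matches))   # 2 1
```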
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (89, 0.5844155844155844, 0.9793399259567799, 0.4155844155844156)
    (226, 0.5844155844155844, 0.9793399259567799, 0.4155844155844156)

Current size of match and non-match training data sets: 32 / 45

Selected cluster (queue ordering: random) with:
- Purity 0.58 and entropy 0.98
- Size 226 weight vectors
- Estimated match proportion 0.416

Sample size for this cluster: 66

Farthest first selection of 66 weight vectors from 226 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

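Farthest-first selection greedily picks the vector whose minimum distance to the already-selected set is largest, so the sample spreads out across the cluster. A sketch assuming Euclidean distance and seeding from the first vector (the script's actual seeding and distance metric may differ):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal: start from the
    first vector, then repeatedly add the vector whose minimum
    Euclidean distance to the already-selected set is largest."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        def min_dist(v):
            return min(math.dist(v, s) for s in selected)
        best = max(remaining, key=min_dist)
        remaining.remove(best)
        selected.append(best)
    return selected

points = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.0, 1.0]]
print(farthest_first(points, 3))   # [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
```

Note how the near-duplicate `[0.1, 0.0]` is skipped in favour of the two distant corners, which is why the selected samples above cover both matches and non-matches.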
Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 8 matches and 58 non-matches
    Purity of oracle classification:  0.879
    Entropy of oracle classification: 0.533
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

125.0
Analysing the file: diverg(15)228_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 228), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)228_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 799
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 799 weight vectors
  Containing 222 true matches and 577 true non-matches
    (27.78% true matches)
  Identified 745 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   708  (95.03%)
          2 :    34  (4.56%)
          3 :     2  (0.27%)
         17 :     1  (0.13%)

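The occurrence table above counts how many distinct weight vectors appear once, twice, and so on. This can be sketched with `collections.Counter` over hashable (tuple) vectors; the toy data here is illustrative, not taken from the file:

```python
from collections import Counter

# Frequency distribution of identical weight vectors, as in the
# "Occurrence : Number of weight vectors" table above.
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (1.0, 0.5), (0.9, 0.9)]
per_vector = Counter(vectors)                # how often each vector occurs
distribution = Counter(per_vector.values())  # occurrence count -> #vectors
for occ, count in sorted(distribution.items()):
    print('%3d : %3d  (%.2f%%)' % (occ, count, 100.0 * count / len(per_vector)))
# prints:
#   1 :   2  (66.67%)
#   3 :   1  (33.33%)
```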
Identified 1 non-pure unique weight vectors (from 745 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 556

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 798
  Number of unique weight vectors: 745

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (745, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 745 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 745 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 660 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 147 matches and 513 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (513, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 147 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 147 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00 accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

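Across the runs, the log follows one loop: pop a cluster from the queue at random, oracle-classify a sample, add the labels to the training sets, and, if the cluster is not pure enough or too large, split the remainder into child clusters and re-queue them until the manual classification budget is spent. A simplified, runnable sketch (all names and the toy oracle/split functions are mine; the real script tracks purity per cluster and splits with an SVM):

```python
import random

def recursive_selection(initial_cluster, oracle, sample_size, budget,
                        min_purity, max_cluster_size, split):
    """Sketch of the main loop seen in this log: sample, oracle-classify,
    and split clusters recursively until the manual budget is exhausted."""
    queue = [initial_cluster]
    train_match, train_non_match = [], []
    used = 0
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random queue ordering
        sample, rest = cluster[:sample_size], cluster[sample_size:]
        labels = [oracle(vec) for vec in sample]           # manual classifications
        used += len(sample)
        for vec, is_match in zip(sample, labels):
            (train_match if is_match else train_non_match).append(vec)
        purity = max(labels.count(True), labels.count(False)) / len(labels)
        if purity < min_purity or len(rest) > max_cluster_size:
            # re-queue non-empty child clusters (the script splits via an SVM)
            queue.extend(child for child in split(sample, labels, rest) if child)
    return train_match, train_non_match

# Toy run: the oracle says "match" when the first weight exceeds 0.5;
# the split simply halves the remainder instead of using a classifier.
random.seed(0)
cluster = [[i / 10.0] for i in range(10)]
tm, tn = recursive_selection(
    cluster, oracle=lambda v: v[0] > 0.5, sample_size=4, budget=8,
    min_purity=0.95, max_cluster_size=2,
    split=lambda sample, labels, rest: [rest[:len(rest) // 2],
                                        rest[len(rest) // 2:]])
print(len(tm), len(tn))   # 4 6
```

As in the log, the budget check happens between iterations, so the last sampled batch can carry the count past the budget before the loop stops.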
42.0
Analysing the file: diverg(10)18_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 18), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)18_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 722
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 722 weight vectors
  Containing 197 true matches and 525 true non-matches
    (27.29% true matches)
  Identified 680 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   645  (94.85%)
          2 :    32  (4.71%)
          3 :     2  (0.29%)
          7 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 680 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 175
     0.000 : 505

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 722
  Number of unique weight vectors: 680

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (680, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 680 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 680 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 596 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 290 matches and 306 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (290, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (306, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 306 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 306 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.767, 0.545, 0.818, 0.714, 0.773] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.423, 0.478, 0.357, 0.615, 0.727] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [0.800, 0.000, 0.625, 0.571, 0.467, 0.474, 0.667] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.778, 0.500, 0.789, 0.750, 0.385] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.333, 0.600, 0.800, 0.778, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.833, 0.833, 0.550, 0.500, 0.688] (False)
    [1.000, 0.000, 0.600, 0.857, 0.579, 0.286, 0.545] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.875, 0.467, 0.471, 0.833, 0.571] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.857, 0.000, 0.500, 0.389, 0.235, 0.045, 0.526] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [1.000, 0.000, 0.556, 0.364, 0.583, 0.500, 0.636] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.000, 0.700, 0.818, 0.444, 0.619] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [1.000, 0.000, 0.767, 0.667, 0.545, 0.786, 0.773] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 0 matches and 68 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0
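
The purity and entropy figures printed by the oracle step follow the standard two-class definitions: purity is the majority-class fraction of the labelled sample, entropy the base-2 Shannon entropy of the label distribution. A minimal sketch (the function names are illustrative, not taken from the original script); note that the Loop 2 queue entries further down (0.6704..., 0.9144...) are exactly `purity(29, 59)` and `entropy(29, 59)`:

```python
import math

def purity(num_matches, num_non_matches):
    # Fraction of the majority class in the labelled sample.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Two-class Shannon entropy (base 2) of the sample's labels.
    total = num_matches + num_non_matches
    result = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            result -= p * math.log2(p)
    return result

# 0 matches and 68 non-matches, as in the oracle output above:
print(round(purity(0, 68), 3), round(entropy(0, 68), 3))    # 1.0 0.0
# 29 matches and 59 non-matches (the Loop 2 cluster stats):
print(round(purity(29, 59), 3), round(entropy(29, 59), 3))  # 0.67 0.914
```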

*** Warning: Oracle returns an empty match dictionary ***
Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analyzing file: diverg(20)32_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 32), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)32_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
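
The occurrence distribution above can be derived by hashing each exact weight vector and then counting how many distinct vectors share each frequency; a sketch, assuming the vectors arrive as lists of floats:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # Count how often each exact weight vector occurs, then count how
    # many distinct vectors share each occurrence frequency.
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

vectors = [[0.5, 1.0], [0.5, 1.0], [0.2, 0.9], [1.0, 1.0], [0.5, 1.0]]
print(sorted(occurrence_distribution(vectors).items()))
# [(1, 2), (3, 1)]  -> two vectors occur once, one vector occurs three times
```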

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector
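
The pureness filter appears to keep, for each non-pure unique weight vector, only its majority-class instances (here a single minority non-match was dropped, 1084 -> 1083). A plausible sketch of that step, with illustrative names:

```python
from collections import defaultdict

def remove_minority_class(pairs):
    # pairs: list of (weight_vector_tuple, is_match), possibly with
    # duplicate vectors carrying conflicting match labels. For each
    # unique vector, keep only the instances of its majority class.
    by_vec = defaultdict(list)
    for vec, is_match in pairs:
        by_vec[vec].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        num_match = sum(labels)
        majority = num_match >= len(labels) - num_match
        for label in labels:
            if label == majority:
                kept.append((vec, label))
    return kept

# One vector seen 20 times with pureness 0.95 (19 matches, 1 non-match):
pairs = [((0.9, 1.0), True)] * 19 + [((0.9, 1.0), False)] + [((0.1, 0.2), False)]
print(len(remove_minority_class(pairs)))  # 20 -> the minority non-match removed
```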

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

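Farthest-first selection is the classic greedy traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest. A sketch, assuming Euclidean distance and the first vector as the seed (the original script's seeding rule is unknown):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector, then
    # repeatedly add the vector maximising the minimum distance to the
    # already-selected set.
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(points, 3))
# [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```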
Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 29 matches and 59 non-matches
    Purity of oracle classification:  0.670
    Entropy of oracle classification: 0.914
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 29 matches and 59 non-matches
  Classified 162 matches and 777 non-matches
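
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by predicted class. A sketch assuming scikit-learn's `SVC` (the original script's kernel and parameters are unknown):

```python
from sklearn.svm import SVC

def svm_split(labelled_vectors, labels, unlabelled_vectors):
    # Train an SVM on the oracle-labelled sample, then split the rest of
    # the cluster into predicted-match / predicted-non-match sub-clusters.
    clf = SVC(kernel='linear')
    clf.fit(labelled_vectors, labels)
    predictions = clf.predict(unlabelled_vectors)
    matches = [v for v, p in zip(unlabelled_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vectors, predictions) if p == 0]
    return matches, non_matches

train_X = [[0.9, 0.95], [0.85, 1.0], [0.1, 0.2], [0.2, 0.1]]
train_y = [1, 1, 0, 0]
matches, non_matches = svm_split(train_X, train_y, [[0.8, 0.9], [0.15, 0.1]])
print(len(matches), len(non_matches))  # 1 1
```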

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (162, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)
    (777, 0.6704545454545454, 0.9144612916935675, 0.32954545454545453)

Current size of match and non-match training data sets: 29 / 59

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 162 weight vectors
- Estimated match proportion 0.330

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 162 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)979_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 979), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)979_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 146 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (538, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 146 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 50 matches and 4 non-matches
    Purity of oracle classification:  0.926
    Entropy of oracle classification: 0.381
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)499_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984848
recall                 0.217391
f-measure              0.356164
da                           66
dm                            0
ndm                           0
tp                           65
fp                            1
tn                  4.76529e+07
fn                          234
Name: (10, 1 - acm diverg, 499), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)499_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 296
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 296 weight vectors
  Containing 185 true matches and 111 true non-matches
    (62.50% true matches)
  Identified 268 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   254  (94.78%)
          2 :    11  (4.10%)
          3 :     2  (0.75%)
         14 :     1  (0.37%)

Identified 1 non-pure unique weight vector (from 268 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 159
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 108

Removed 1 non-pure weight vector

Final number of weight vectors to use: 295
  Number of unique weight vectors: 268

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (268, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 268 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 71

Perform initial selection using "far" method

Farthest first selection of 71 weight vectors from 268 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
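The "far" method shown above can be sketched as a greedy farthest-first traversal: start from a seed vector and repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest. The seed choice (first vector) and tie-breaking below are assumptions; the original program may differ:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors: each step picks the
    candidate maximising the minimum distance to the selected set."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]   # assumed seed: first vector in the list
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected
```

This greedy strategy tends to pick weight vectors spread over the whole similarity space, which is why the sampled lists above mix clear matches, clear non-matches and borderline cases.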

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 34 matches and 37 non-matches
    Purity of oracle classification:  0.521
    Entropy of oracle classification: 0.999
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0
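The purity, entropy and estimated match proportion figures in the oracle summary follow directly from the match/non-match counts; a minimal sketch of the computation:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction), binary entropy, and estimated
    match proportion of a labelled sample of weight vectors."""
    n = num_matches + num_non_matches
    p = num_matches / n                  # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)  # 1.0 for a 50/50 split, 0.0 if pure
    return purity, entropy, p
```

For the 34 matches and 37 non-matches above, this yields purity 0.521, entropy 0.999 and match proportion 0.479, the values carried into the Loop 2 queue entries.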

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 197 weight vectors
  Based on 34 matches and 37 non-matches
  Classified 127 matches and 70 non-matches
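The SVM split of the remaining cluster can be sketched with scikit-learn (an assumption; the kernel and SVM implementation used by the original program are not shown in this output). The oracle-labelled sample serves as training data, and the classifier's predictions partition the unlabelled vectors into two new clusters for the queue:

```python
from sklearn.svm import SVC

def svm_split(train_match, train_non_match, remaining):
    """Train an SVM on the oracle-labelled sample and split the remaining
    weight vectors into predicted matches and non-matches."""
    X = train_match + train_non_match
    y = [1] * len(train_match) + [0] * len(train_non_match)
    clf = SVC(kernel="linear")   # assumed kernel; original choice unknown
    clf.fit(X, y)
    pred = clf.predict(remaining)
    matches = [v for v, p in zip(remaining, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining, pred) if p == 0]
    return matches, non_matches
```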

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 71
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (127, 0.5211267605633803, 0.9987117514654895, 0.4788732394366197)
    (70, 0.5211267605633803, 0.9987117514654895, 0.4788732394366197)

Current size of match and non-match training data sets: 34 / 37

Selected cluster (queue ordering: random) with:
- Purity 0.52 and entropy 1.00
- Size 127 weight vectors
- Estimated match proportion 0.479

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 127 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0
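The oracle itself can be simulated as below: each queried weight vector's true label is returned correctly with probability equal to the configured accuracy (1.0 in these runs, hence zero false matches and false non-matches). The function name and signature are illustrative, not the program's own:

```python
import random

def oracle_classify(sample, accuracy=1.0, rng=None):
    """Simulate a manual-classification oracle of a given accuracy.

    sample: list of (vector, true_is_match) pairs.
    Returns (matches, non_matches) as classified by the oracle.
    """
    rng = rng or random.Random(0)
    matches, non_matches = [], []
    for vec, true_is_match in sample:
        # Return the true label with probability `accuracy`, else flip it
        is_match = true_is_match if rng.random() < accuracy else not true_is_match
        (matches if is_match else non_matches).append(vec)
    return matches, non_matches
```

With accuracy below 1.0 the flipped labels would appear in the log as false matches and false non-matches.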

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

66.0
Analysing file: diverg(10)486_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 486), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)486_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 637
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 637 weight vectors
  Containing 205 true matches and 432 true non-matches
    (32.18% true matches)
  Identified 606 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   592  (97.69%)
          2 :    11  (1.82%)
          3 :     2  (0.33%)
         17 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 606 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 431

Removed 1 non-pure weight vector

Final number of weight vectors to use: 636
  Number of unique weight vectors: 606

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (606, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 606 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 606 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 33 matches and 50 non-matches
    Purity of oracle classification:  0.602
    Entropy of oracle classification: 0.970
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 523 weight vectors
  Based on 33 matches and 50 non-matches
  Classified 279 matches and 244 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (279, 0.6024096385542169, 0.9695235828220428, 0.39759036144578314)
    (244, 0.6024096385542169, 0.9695235828220428, 0.39759036144578314)

Current size of match and non-match training data sets: 33 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.60 and entropy 0.97
- Size 244 weight vectors
- Estimated match proportion 0.398

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 244 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.533, 0.000, 0.577, 0.783, 0.429, 0.615, 0.478] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.762, 0.714, 0.500, 0.400] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.462, 0.667, 0.600, 0.389, 0.615] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [1.000, 0.000, 0.435, 0.700, 0.600, 0.647, 0.714] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.667, 0.722, 0.353, 0.545, 0.800] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 0 matches and 67 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(20)831_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 831), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)831_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
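Farthest-first selection, as used above, is the greedy traversal that repeatedly picks the vector whose minimum distance to the already-selected set is largest. A minimal sketch; Euclidean distance and seeding with the first vector are assumptions, since the log shows neither choice:

```python
import math

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal over a list of equal-length tuples."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [seed]
    # minimum distance from every vector to the selected set so far
    min_dist = [dist(v, vectors[seed]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[nxt]))
    return selected

# toy usage: pick 3 well-spread points out of 4
pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # [0, 2, 3]
```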

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
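The overall procedure visible in the log — pop a cluster from the queue, oracle-label a sample from it, split the remainder with a trained classifier, and stop when the manual classification budget is exhausted — can be sketched as follows. All names and the toy threshold classifier are ours; the real program samples by farthest-first and splits with an SVM, and here the sample is simply the cluster prefix:

```python
from collections import deque

def recursive_select(vectors, labels, budget, sample_size, classify):
    """Sketch of the outer loop: pop a cluster, oracle-label a sample,
    then split the remainder with a classifier and requeue both halves."""
    queue = deque([list(range(len(vectors)))])
    train, used = [], 0
    while queue and used + sample_size <= budget:
        cluster = queue.popleft()
        chosen, rest = cluster[:sample_size], cluster[sample_size:]
        train += [(i, labels[i]) for i in chosen]   # the oracle's answers
        used += len(chosen)
        if rest:
            pred = classify(train, rest, vectors)
            for part in ([i for i in rest if pred[i]],
                         [i for i in rest if not pred[i]]):
                if part:
                    queue.append(part)
    return train

# toy run: the "classifier" just thresholds the first weight component
vecs = [(x / 10.0,) for x in range(10)]
labs = [v[0] >= 0.5 for v in vecs]

def threshold_clf(train, rest, vectors):
    return {i: vectors[i][0] >= 0.5 for i in rest}

picked = recursive_select(vecs, labs, budget=6, sample_size=2,
                          classify=threshold_clf)
print(len(picked))  # 6
```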

40.0
Analysing file: diverg(15)73_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 73), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)73_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 861
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 861 weight vectors
  Containing 227 true matches and 634 true non-matches
    (26.36% true matches)
  Identified 804 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   767  (95.40%)
          2 :    34  (4.23%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)
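The frequency distribution above — how many unique weight vectors occur once, twice, and so on — is a double counting step; a minimal sketch on toy data:

```python
from collections import Counter

# toy vectors; the real ones are 7-dimensional similarity tuples
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.9), (1.0, 0.5), (0.0, 0.0)]
occ = Counter(vectors)        # each unique vector -> how often it occurs
freq = Counter(occ.values())  # occurrence count -> number of unique vectors
print(len(occ), dict(freq))   # 3 {3: 1, 1: 2}
```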

Identified 1 non-pure unique weight vector (from 804 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 613

Removed 1 non-pure weight vector
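A non-pure unique weight vector is one generated by both true matches and true non-matches; its minority-class copies are dropped (here, apparently the vector occurring 20 times with pureness 0.950, i.e. 19 matches and 1 non-match). A minimal sketch on hypothetical data:

```python
from collections import defaultdict

# (vector, true_match) pairs; the first vector is non-pure: 3 matches, 1 non-match
pairs = [((1.0, 0.9), True), ((1.0, 0.9), True), ((1.0, 0.9), True),
         ((1.0, 0.9), False), ((0.1, 0.2), False)]

by_vec = defaultdict(list)
for vec, is_match in pairs:
    by_vec[vec].append(is_match)

cleaned = []
for vec, flags in by_vec.items():
    majority = sum(flags) * 2 >= len(flags)       # True if matches dominate
    cleaned += [(vec, f) for f in flags if f == majority]

print(len(pairs), len(cleaned))  # 5 4
```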

Final number of weight vectors to use: 860
  Number of unique weight vectors: 804

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (804, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 804 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 804 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 718 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 565 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (565, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 565 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 565 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)307_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 307), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)307_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 638
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 638 weight vectors
  Containing 187 true matches and 451 true non-matches
    (29.31% true matches)
  Identified 598 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   564  (94.31%)
          2 :    31  (5.18%)
          3 :     2  (0.33%)
          6 :     1  (0.17%)

Identified 0 non-pure unique weight vectors (from 598 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.000 : 431

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 638
  Number of unique weight vectors: 598

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (598, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 598 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 598 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 515 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 146 matches and 369 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (369, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 369 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 369 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.767, 0.667, 0.545, 0.786, 0.773] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 1 match and 68 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.109
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

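The purity and entropy figures the oracle step reports can be reproduced with a short sketch. The formulas below (majority-class fraction and binary entropy) are inferred from the numbers in this log, not taken from the program's source:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary entropy of a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = cluster_stats(1, 68)
print(round(purity, 3), round(entropy, 3))  # 0.986 0.109, as reported above
```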
Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(20)71_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 71), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)71_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 789
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 789 weight vectors
  Containing 225 true matches and 564 true non-matches
    (28.52% true matches)
  Identified 750 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   731  (97.47%)
          2 :    16  (2.13%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

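A frequency distribution like the one above can be computed with `collections.Counter`; a minimal sketch using made-up vectors in place of the ones the program loads from the CSV file:

```python
from collections import Counter

# hypothetical similarity weight vectors, standing in for the ones the
# program loads from the CSV file
vectors = [(1.0, 0.5), (1.0, 0.5), (0.3, 0.2), (1.0, 0.5), (0.8, 0.9)]

counts = Counter(vectors)           # how often each unique vector occurs
freq = Counter(counts.values())     # occurrence -> number of unique vectors
print(sorted(freq.items()))         # [(1, 2), (3, 1)]
```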
Identified 1 non-pure unique weight vector (from 750 unique weight vectors)
Pureness (as fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 561

Removed 1 non-pure weight vector

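The minority-class removal above can be sketched like this, assuming each unique weight vector keeps the true-match labels of the record pairs that generated it (the names and toy data below are hypothetical):

```python
def pureness(labels):
    """Fraction of true matches among the record pairs that share
    one unique weight vector."""
    return sum(labels) / len(labels)

# hypothetical unique vectors -> true-match labels of their record pairs
vec_labels = {(1.0, 0.9): [1, 1, 1],   # pure match vector
              (0.5, 0.4): [0],         # pure non-match vector
              (0.8, 0.7): [1, 1, 0]}   # non-pure: one minority non-match

# drop the minority-class pairs of every non-pure vector, as in the log
majority = {v: round(pureness(lab)) for v, lab in vec_labels.items()}
cleaned = {v: [l for l in lab if l == majority[v]]
           for v, lab in vec_labels.items()}
print(cleaned[(0.8, 0.7)])  # [1, 1] - the single non-match was removed
```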
Final number of weight vectors to use: 788
  Number of unique weight vectors: 750

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (750, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 750 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

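The per-cluster sample sizes in this log (85 for a 750-vector cluster at estimated match proportion 0.5, 78 for 512 at 0.4, and so on) are consistent with Cochran's sample-size formula under finite-population correction, taking the cluster's estimated match proportion as p, the command line's sample_error as 0.1, and a 95% z-score. Both constants are inferred from the reported numbers, not read from the source:

```python
def cochran_sample_size(cluster_size, est_match_prop, sample_error=0.1,
                        z=1.96):
    """Cochran's formula with finite-population correction; sample_error
    and the 95% z-score are assumed values that reproduce this log."""
    p = est_match_prop
    n0 = z * z * p * (1.0 - p) / (sample_error * sample_error)
    return round(n0 / (1.0 + (n0 - 1.0) / cluster_size))

print(cochran_sample_size(750, 0.5))   # 85, as reported for this cluster
print(cochran_sample_size(512, 0.4))   # 78, the Loop 2 cluster below
```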
Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 750 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

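A greedy farthest-first traversal like the one reported can be sketched as follows. The real program's starting point and distance measure may differ; this sketch starts from the first vector and uses squared Euclidean distance:

```python
def farthest_first(vectors, k):
    """Greedily pick k vectors, each maximising its minimum distance
    to the vectors already selected."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    pool = list(vectors)
    selected = [pool.pop(0)]          # deterministic start for this sketch
    while len(selected) < k and pool:
        far = max(pool, key=lambda v: min(sq_dist(v, s) for s in selected))
        pool.remove(far)
        selected.append(far)
    return selected

pts = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.5, 0.5)]
print(farthest_first(pts, 2))  # [(0.0, 0.0), (1.0, 1.0)]
```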
Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 34 matches and 51 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 665 weight vectors
  Based on 34 matches and 51 non-matches
  Classified 153 matches and 512 non-matches

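The split step above trains a classifier on the 85 oracle-labelled vectors and uses it to divide the remaining 665. As a self-contained illustration, here is a Pegasos-style linear SVM in pure Python; the actual program presumably calls a full SVM library, so this is a sketch of the idea rather than its implementation:

```python
import random

def train_linear_svm(xs, ys, lam=0.05, epochs=500, seed=1):
    """Pegasos-style linear SVM; ys must be +1 (match) / -1 (non-match).
    A bias term is folded in by augmenting each vector with a constant 1."""
    random.seed(seed)
    aug = [list(x) + [1.0] for x in xs]
    w = [0.0] * len(aug[0])
    t = 0
    for _ in range(epochs):
        for i in random.sample(range(len(aug)), len(aug)):
            t += 1
            eta = 1.0 / (lam * t)
            margin = ys[i] * sum(wj * xj for wj, xj in zip(w, aug[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]   # regularisation step
            if margin < 1.0:                           # hinge-loss step
                w = [wj + eta * ys[i] * xj for wj, xj in zip(w, aug[i])]
    return w

def classify(w, x):
    score = sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return 1 if score >= 0.0 else -1

# toy stand-ins for oracle-labelled weight vectors
xs = [[0.9, 0.8], [1.0, 0.9], [0.1, 0.2], [0.0, 0.1]]
ys = [1, 1, -1, -1]
w = train_linear_svm(xs, ys)
print([classify(w, x) for x in xs])  # [1, 1, -1, -1]
```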
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6, 0.9709505944546686, 0.4)
    (512, 0.6, 0.9709505944546686, 0.4)

Current size of match and non-match training data sets: 34 / 51

Selected cluster (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 512 weight vectors
- Estimated match proportion 0.400

Sample size for this cluster: 78

Farthest first selection of 78 weight vectors from 512 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.870, 0.619, 0.643, 0.700, 0.524] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 4 matches and 74 non-matches
    Purity of oracle classification:  0.949
    Entropy of oracle classification: 0.292
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)342_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 342), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)342_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 722
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 722 weight vectors
  Containing 208 true matches and 514 true non-matches
    (28.81% true matches)
  Identified 689 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   675  (97.97%)
          2 :    11  (1.60%)
          3 :     2  (0.29%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 689 unique weight vectors)
Pureness (as fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 175
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 513

Removed 1 non-pure weight vector

Final number of weight vectors to use: 721
  Number of unique weight vectors: 689

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (689, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 689 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 689 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 605 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 111 matches and 494 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (111, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (494, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 111 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 111 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(20)472_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 472), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)472_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 970
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 970 weight vectors
  Containing 219 true matches and 751 true non-matches
    (22.58% true matches)
  Identified 915 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   879  (96.07%)
          2 :    33  (3.61%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 915 unique weight vectors)
Pureness (as fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 730

Removed 1 non-pure weight vector

Final number of weight vectors to use: 969
  Number of unique weight vectors: 915

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (915, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 915 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 915 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
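
The oracle step logged above can be simulated with a minimal sketch: with accuracy a, each queried label comes back correct with probability a and flipped otherwise. `noisy_oracle` is a hypothetical helper for illustration, not the script's own function:

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    """Simulate a human oracle: return each true label,
    flipped with probability (1 - accuracy)."""
    rng = rng or random.Random(0)
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]

# 100% accuracy, as in the run above: every label comes back unchanged
labels = [True] * 27 + [False] * 60
answers = noisy_oracle(labels, accuracy=1.0)
```

With `accuracy=1.0` no label is ever flipped, matching the "wrongly classify 0" line in the log.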

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 828 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 148 matches and 680 non-matches
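
The split step trains a classifier on the sample just labelled by the oracle and partitions the remaining unlabelled vectors with it. A minimal sketch using scikit-learn's `SVC` — the data here is random filler and the linear kernel is an assumption; the script's own SVM wrapper and settings may differ:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in training data: 27 "match" and 60 "non-match" weight vectors
X_train = rng.random((87, 7))
y_train = np.array([1] * 27 + [0] * 60)

clf = SVC(kernel="linear")  # kernel choice is an assumption
clf.fit(X_train, y_train)

# Partition the 828 remaining vectors into predicted matches / non-matches
X_rest = rng.random((828, 7))
pred = clf.predict(X_rest)
matches, non_matches = X_rest[pred == 1], X_rest[pred == 0]
```

The two predicted groups then re-enter the queue as separate clusters, which is why the queue length grows by one per split.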

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (680, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
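
The purity and entropy figures printed for each queued cluster are consistent with purity being the majority-class fraction and entropy the binary entropy (in bits) of the estimated match proportion; a minimal sketch, assuming exactly that definition:

```python
import math

def cluster_stats(match_prop):
    """Purity = fraction of the majority class; entropy = binary
    entropy (bits) of the estimated match proportion."""
    purity = max(match_prop, 1.0 - match_prop)
    if match_prop in (0.0, 1.0):
        return purity, 0.0
    entropy = -(match_prop * math.log2(match_prop)
                + (1.0 - match_prop) * math.log2(1.0 - match_prop))
    return purity, entropy

# 27 matches out of 87 oracle labels reproduces the logged
# purity 0.6896... and entropy 0.8935... shown for both clusters
purity, entropy = cluster_stats(27 / 87)
```

Note that both child clusters inherit the parent sample's statistics until they are sampled themselves, which is why the two queue entries show identical values.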

Current size of match and non-match training data sets: 27 / 60

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 680 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 680 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
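
The farthest-first selection logged above is a greedy max-min traversal: start from one vector, then repeatedly pick the vector whose minimum distance to the already-selected set is largest. A generic sketch — the Euclidean metric and random start are assumptions, and the script's implementation may differ:

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal: repeatedly add the vector
    farthest (max-min distance) from the selected set."""
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    selected = [int(rng.integers(len(vectors)))]
    # min distance from every vector to the current selected set
    dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected

# e.g. select 73 spread-out vectors from 680 seven-dimensional ones
vecs = np.random.default_rng(42).random((680, 7))
picked = farthest_first(vecs, 73)
```

Because a selected vector's min-distance drops to zero, it is never chosen again, so the sample spreads across the cluster rather than concentrating in one region.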

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)362_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 362), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)362_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 343
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 343 weight vectors
  Containing 185 true matches and 158 true non-matches
    (53.94% true matches)
  Identified 322 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   308  (95.65%)
          2 :    11  (3.42%)
          3 :     2  (0.62%)
          7 :     1  (0.31%)

Identified 0 non-pure unique weight vectors (from 322 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 164
     0.000 : 158

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 343
  Number of unique weight vectors: 322

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (322, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 322 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 322 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.800, 1.000, 0.111, 0.200, 0.100, 0.194, 0.094] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 47 matches and 27 non-matches
    Purity of oracle classification:  0.635
    Entropy of oracle classification: 0.947
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  27
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 248 weight vectors
  Based on 47 matches and 27 non-matches
  Classified 227 matches and 21 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 74
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (227, 0.6351351351351351, 0.9466474387740497, 0.6351351351351351)
    (21, 0.6351351351351351, 0.9466474387740497, 0.6351351351351351)

Current size of match and non-match training data sets: 47 / 27

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.95
- Size 227 weight vectors
- Estimated match proportion 0.635

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 227 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.800, 1.000, 0.211, 0.133, 0.074, 0.133, 0.185] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 40 matches and 24 non-matches
    Purity of oracle classification:  0.625
    Entropy of oracle classification: 0.954
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  24
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analysing file: diverg(15)554_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (15, 1 - acm diverg, 554), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)554_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 910
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 910 weight vectors
  Containing 177 true matches and 733 true non-matches
    (19.45% true matches)
  Identified 871 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   841  (96.56%)
          2 :    27  (3.10%)
          3 :     2  (0.23%)
          9 :     1  (0.11%)

Identified 1 non-pure unique weight vectors (from 871 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 158
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 712

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 901
  Number of unique weight vectors: 870
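
The pureness analysis above groups duplicate weight vectors and computes, per unique vector, the fraction of its occurrences that are true matches; a vector with pureness strictly between 0 and 1 (here the single 0.889 entry) is non-pure, and all its occurrences are removed. A minimal sketch, with `pureness` as a hypothetical helper:

```python
from collections import defaultdict

def pureness(weight_vectors, labels):
    """For each unique weight vector, the fraction of its occurrences
    that are true matches (1.0 or 0.0 means the vector is pure)."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [matches, total]
    for vec, is_match in zip(weight_vectors, labels):
        counts[tuple(vec)][0] += int(is_match)
        counts[tuple(vec)][1] += 1
    return {v: m / n for v, (m, n) in counts.items()}

# a duplicated vector seen once as a match and once as a non-match
# is non-pure (pureness 0.5) and would be dropped
p = pureness([[0.5, 1.0], [0.5, 1.0], [0.9, 0.2]],
             [True, False, True])
```

In the run above, the one vector with pureness 0.889 occurred 9 times, which accounts for the "Removed 9 non-pure weight vectors" line.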

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (870, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 870 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 870 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 784 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 132 matches and 652 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (132, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (652, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 652 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 652 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
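
The lists of "farthest" vectors above are produced by a greedy farthest-first traversal. A minimal sketch of that classic algorithm, assuming Euclidean distance and the first vector as the seed (the script's actual seeding and distance metric are not shown in the log):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: repeatedly add the vector whose
    # minimum distance to the already-selected set is largest.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                  # assumed seed: the first vector
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

Because each pick maximizes the distance to everything chosen so far, the sample spreads across the weight-vector space, which is why the selections above mix clear matches and clear non-matches.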

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 2 matches and 73 non-matches
    Purity of oracle classification:  0.973
    Entropy of oracle classification: 0.177
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0
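
The purity and entropy figures reported after each oracle call follow directly from the classified match/non-match counts: purity is the majority-class fraction, entropy the binary Shannon entropy (base 2) of the match proportion. A sketch (hypothetical function name) that reproduces the 0.973 / 0.177 above from 2 matches and 73 non-matches:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    # Purity: fraction of the majority class; entropy: binary Shannon
    # entropy (base 2) of the match proportion.
    n = num_matches + num_non_matches
    p = num_matches / n
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log(q, 2) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

print(purity_and_entropy(2, 73))   # ≈ (0.973, 0.177), as in the log above
```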

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing file: diverg(10)28_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990385
recall                 0.344482
f-measure              0.511166
da                          104
dm                            0
ndm                           0
tp                          103
fp                            1
tn                  4.76529e+07
fn                          196
Name: (10, 1 - acm diverg, 28), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)28_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 576
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 576 weight vectors
  Containing 150 true matches and 426 true non-matches
    (26.04% true matches)
  Identified 559 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   550  (98.39%)
          2 :     6  (1.07%)
          3 :     2  (0.36%)
          8 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 559 unique weight vectors)
Pureness (as fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 135
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 423

Removed 8 non-pure weight vectors
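
The pureness step groups duplicate weight vectors, computes for each unique vector the fraction of its occurrences that are true matches, and removes every copy of any vector that is neither fully a match (1.000) nor fully a non-match (0.000); that is how the single 0.875-pure vector accounts for all 8 removed copies above. A sketch with hypothetical names:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, match_flags):
    # Group duplicate weight vectors (hashable tuples) and count, per unique
    # vector, how many of its occurrences are true matches.
    counts = defaultdict(lambda: [0, 0])      # vector -> [num matches, total]
    for vec, is_match in zip(weight_vectors, match_flags):
        counts[vec][0] += int(is_match)
        counts[vec][1] += 1
    # Keep only vectors whose occurrences are all matches or all non-matches.
    pure = {v for v, (m, n) in counts.items() if m == 0 or m == n}
    return [(v, f) for v, f in zip(weight_vectors, match_flags) if v in pure]
```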

Final number of weight vectors to use: 568
  Number of unique weight vectors: 558

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (558, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 558 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 558 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 32 matches and 50 non-matches
    Purity of oracle classification:  0.610
    Entropy of oracle classification: 0.965
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 476 weight vectors
  Based on 32 matches and 50 non-matches
  Classified 103 matches and 373 non-matches
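
After the oracle round, the remaining unlabelled weight vectors in the cluster are classified by an SVM trained on all examples the oracle has labelled so far, splitting the cluster into a predicted-match and a predicted-non-match child. A minimal scikit-learn sketch; the kernel and parameters are assumptions, since the script's SVM settings are not shown in the log:

```python
from sklearn import svm

def svm_split(train_vectors, train_labels, unlabelled_vectors):
    # Train on the oracle-labelled vectors (label 1 = match, 0 = non-match)
    # and split the unlabelled vectors by the predicted class.
    clf = svm.SVC(kernel="linear")            # assumed kernel and parameters
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(unlabelled_vectors)
    matches = [v for v, p in zip(unlabelled_vectors, preds) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vectors, preds) if p == 0]
    return matches, non_matches
```

The two resulting clusters are then pushed back onto the queue, which is why the next loop reports "Queue length: 2".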

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)
    (373, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)

Current size of match and non-match training data sets: 32 / 50

Selected cluster (queue ordering: random):
- Purity 0.61 and entropy 0.96
- Size 103 weight vectors
- Estimated match proportion 0.390

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 103 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 44 matches and 5 non-matches
    Purity of oracle classification:  0.898
    Entropy of oracle classification: 0.475
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

104.0
Analysing file: diverg(10)844_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 844), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)844_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 300
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 300 weight vectors
  Containing 199 true matches and 101 true non-matches
    (66.33% true matches)
  Identified 267 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   250  (93.63%)
          2 :    14  (5.24%)
          3 :     2  (0.75%)
         16 :     1  (0.37%)

Identified 1 non-pure unique weight vector (from 267 unique weight vectors)
Pureness (as fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 168
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 98

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 299
  Number of unique weight vectors: 267

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (267, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 267 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 71

Perform initial selection using "far" method

Farthest first selection of 71 weight vectors from 267 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 34 matches and 37 non-matches
    Purity of oracle classification:  0.521
    Entropy of oracle classification: 0.999
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 196 weight vectors
  Based on 34 matches and 37 non-matches
  Classified 142 matches and 54 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 71
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.5211267605633803, 0.9987117514654895, 0.4788732394366197)
    (54, 0.5211267605633803, 0.9987117514654895, 0.4788732394366197)

Current size of match and non-match training data sets: 34 / 37

Selected cluster (queue ordering: random):
- Purity 0.52 and entropy 1.00
- Size 142 weight vectors
- Estimated match proportion 0.479

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 51 matches and 6 non-matches
    Purity of oracle classification:  0.895
    Entropy of oracle classification: 0.485
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(15)162_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 162), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)162_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 766
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 766 weight vectors
  Containing 205 true matches and 561 true non-matches
    (26.76% true matches)
  Identified 737 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   720  (97.69%)
          2 :    14  (1.90%)
          3 :     2  (0.27%)
         12 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 737 unique weight vectors)
Pureness (as fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 178
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 558

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 765
  Number of unique weight vectors: 737

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (737, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 737 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 737 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

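The "far" selection listed above is a farthest-first traversal: at each step it adds the weight vector whose distance to the nearest already-selected vector is largest. A minimal sketch, assuming Euclidean distance and the first vector as an arbitrary seed (the original program may choose the seed and break ties differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric tuples."""
    selected = [vectors[0]]
    # dists[j] = distance from vectors[j] to its nearest selected vector
    dists = [math.dist(v, vectors[0]) for v in vectors]
    for _ in range(k - 1):
        # Pick the vector farthest from everything selected so far
        i = max(range(len(vectors)), key=dists.__getitem__)
        selected.append(vectors[i])
        # Update nearest-selected distances against the new pick
        dists = [min(d, math.dist(v, vectors[i]))
                 for d, v in zip(dists, vectors)]
    return selected
```

Because each pick maximises the minimum distance to the current set, the sample spreads across the corners of the weight-vector space, which is why the listing above mixes clear matches and clear non-matches.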
Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 34 matches and 51 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

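The purity and entropy figures reported for each oracle classification follow the standard two-class definitions: purity is the majority-class fraction, and entropy is the base-2 Shannon entropy of the match/non-match split. A sketch reproducing the values above (34 matches and 51 non-matches give purity 0.600 and entropy 0.971):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = base-2 Shannon
    entropy of the two-class match/non-match distribution."""
    n = num_matches + num_non_matches
    p = num_matches / n
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```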
Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 652 weight vectors
  Based on 34 matches and 51 non-matches
  Classified 150 matches and 502 non-matches

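The split step above trains a classifier on the oracle-labelled vectors and partitions the remaining cluster by predicted class. A sketch using scikit-learn's `SVC`; the linear kernel and default parameters are assumptions, since the original program's SVM settings are not shown in this log:

```python
from sklearn.svm import SVC

def split_cluster(labelled_vecs, labelled_classes, unlabelled_vecs):
    """Train an SVM on oracle-labelled vectors (class 1 = match,
    0 = non-match), then split the rest of the cluster by prediction."""
    clf = SVC(kernel='linear')  # kernel choice is an assumption
    clf.fit(labelled_vecs, labelled_classes)
    pred = clf.predict(unlabelled_vecs)
    matches = [v for v, c in zip(unlabelled_vecs, pred) if c == 1]
    non_matches = [v for v, c in zip(unlabelled_vecs, pred) if c == 0]
    return matches, non_matches
```

The two sub-clusters are then pushed back onto the queue, which is why the queue length grows from 1 to 2 in the next loop.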
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6, 0.9709505944546686, 0.4)
    (502, 0.6, 0.9709505944546686, 0.4)

Current size of match and non-match training data sets: 34 / 51

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 150 weight vectors
- Estimated match proportion 0.400

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 48 matches and 9 non-matches
    Purity of oracle classification:  0.842
    Entropy of oracle classification: 0.629
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(20)49_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 49), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)49_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-matches
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)961_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 961), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)961_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1007
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1007 weight vectors
  Containing 187 true matches and 820 true non-matches
    (18.57% true matches)
  Identified 965 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   934  (96.79%)
          2 :    28  (2.90%)
          3 :     2  (0.21%)
         11 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 965 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 165
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 799

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1006
  Number of unique weight vectors: 965

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (965, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 965 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 965 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 878 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 307 matches and 571 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (307, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (571, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 307 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 307 vectors
  The selected farthest weight vectors are:
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
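
The seeding and distance metric behind this farthest-first selection are not shown in the log; the sketch below assumes Euclidean distance and seeding from the first vector (the function name is mine). Each step greedily picks the vector whose minimum distance to the already-selected set is largest, which spreads the sample across the cluster:

```python
def farthest_first(vectors, k):
    # Greedy farthest-first traversal: repeatedly select the vector whose
    # minimum Euclidean distance to the selected set is largest.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(vecs, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```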

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 40 matches and 28 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.977
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(10)715_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 715), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)715_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 492
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 492 weight vectors
  Containing 172 true matches and 320 true non-matches
    (34.96% true matches)
  Identified 474 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   462  (97.47%)
          2 :     9  (1.90%)
          3 :     2  (0.42%)
          6 :     1  (0.21%)
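
The occurrence distribution above can be tabulated with two nested counts: occurrences per unique weight vector, then unique vectors per occurrence count. A minimal sketch (the helper name is mine):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First count how often each unique weight vector occurs, then count
    # how many unique vectors share each occurrence count.
    per_vector = Counter(map(tuple, weight_vectors))
    return Counter(per_vector.values())

vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.1), (0.2, 0.3), (0.2, 0.3)]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```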

Identified 0 non-pure unique weight vectors (from 474 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 154
     0.000 : 320

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 492
  Number of unique weight vectors: 474

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (474, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 474 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 474 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 26 matches and 54 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 394 weight vectors
  Based on 26 matches and 54 non-matches
  Classified 130 matches and 264 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.675, 0.9097361225311662, 0.325)
    (264, 0.675, 0.9097361225311662, 0.325)

Current size of match and non-match training data sets: 26 / 54

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 264 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 264 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [0.889, 0.000, 0.714, 0.700, 0.500, 0.636, 0.765] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.500, 0.826, 0.429, 0.538, 0.636] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [1.000, 0.000, 0.500, 0.452, 0.632, 0.714, 0.667] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.786, 0.737, 0.706, 0.318, 0.700] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.500, 0.739, 0.824, 0.591, 0.550] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.790, 0.000, 0.636, 0.619, 0.429, 0.450, 0.609] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.433, 0.737, 0.706, 0.500, 0.800] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 2 matches and 62 non-matches
    Purity of oracle classification:  0.969
    Entropy of oracle classification: 0.201
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(20)58_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 58), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)58_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.429, 0.786, 0.750, 0.389, 0.857] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 147 matches and 537 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (537, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 147 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 147 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 53 matches and 2 non-matches
    Purity of oracle classification:  0.964
    Entropy of oracle classification: 0.225
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)781_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 781), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)781_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 701
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 701 weight vectors
  Containing 216 true matches and 485 true non-matches
    (30.81% true matches)
  Identified 646 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   610  (94.43%)
          2 :    33  (5.11%)
          3 :     2  (0.31%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 646 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 464

Removed 1 non-pure weight vector

Final number of weight vectors to use: 700
  Number of unique weight vectors: 646

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (646, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 646 weight vectors
- Estimated match proportion 0.500
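
The loop traced in this log (pop a cluster from the queue, sample it, ask the oracle, and re-queue the impure remainder as two child clusters) can be sketched as follows. This is a toy reconstruction, not the program's code: the oracle is simulated by the true labels, the sample is a simple prefix instead of farthest-first selection, and the split is a plain halving instead of the classifier split.

```python
from collections import deque

def recursive_select(labels, budget, sample_size, min_purity=0.95):
    """Toy queue-driven training-example selection (illustrative reconstruction)."""
    queue = deque([list(range(len(labels)))])    # start with one cluster: all indices
    matches, non_matches, used = [], [], 0
    while queue and used < budget:
        cluster = queue.popleft()
        sample, rest = cluster[:sample_size], cluster[sample_size:]
        used += len(sample)                      # each sampled pair costs one oracle call
        m = [i for i in sample if labels[i]]     # perfect oracle: return the true label
        u = [i for i in sample if not labels[i]]
        matches += m
        non_matches += u
        purity = max(len(m), len(u)) / max(len(sample), 1)
        if rest and purity < min_purity:         # impure cluster: split and re-queue
            mid = len(rest) // 2
            queue.append(rest[:mid])
            queue.append(rest[mid:])
    return matches, non_matches, used
```

The `used < budget` check mirrors the "Reached end of manual classification budget" messages in the log.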

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 646 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

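Farthest-first traversal, as used for the selection above, picks at each step the vector whose minimum Euclidean distance to the already-selected set is largest, so the sample spreads across the weight-vector space. A minimal sketch (seeding with the first vector is an assumption; the program may seed differently):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors, each maximising its minimum distance to the chosen set."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                      # seed with the first vector (assumption)
    remaining = list(vectors[1:])
    while remaining and len(selected) < k:
        # A candidate's distance to the selected set = min distance to any member
        far = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(far)
        remaining.remove(far)
    return selected
```
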
Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 30 matches and 53 non-matches
    Purity of oracle classification:  0.639
    Entropy of oracle classification: 0.944
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

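The purity and entropy the oracle reports follow directly from the match proportion p of the classified sample: purity = max(p, 1-p) and entropy = -p*log2(p) - (1-p)*log2(1-p). For the 30 matches and 53 non-matches above this reproduces the logged 0.639 and 0.944:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity and binary entropy of an oracle-classified sample."""
    n = num_matches + num_non_matches
    p = num_matches / n                          # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                              # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(30, 53)
print(round(purity, 3), round(entropy, 3))  # 0.639 0.944
```
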
Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 563 weight vectors
  Based on 30 matches and 53 non-matches
  Classified 202 matches and 361 non-matches

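After the sampled vectors are labelled, the log shows an SVM trained on those 30 matches and 53 non-matches being used to split the remaining 563 vectors into two candidate clusters. A minimal sketch with scikit-learn (the kernel and parameters are assumptions; the program's actual settings are not shown in this output):

```python
from sklearn.svm import SVC

def svm_split(labelled_vectors, labels, unlabelled_vectors):
    """Split unlabelled vectors into predicted match / non-match clusters."""
    clf = SVC(kernel="linear")                   # assumed kernel
    clf.fit(labelled_vectors, labels)
    predicted = clf.predict(unlabelled_vectors)
    matches = [v for v, p in zip(unlabelled_vectors, predicted) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vectors, predicted) if p == 0]
    return matches, non_matches

# Toy data: high similarity values = match (1), low = non-match (0)
train_x = [[0.9, 0.8], [0.95, 0.9], [0.1, 0.2], [0.15, 0.1]]
train_y = [1, 1, 0, 0]
m, u = svm_split(train_x, train_y, [[0.85, 0.9], [0.05, 0.1]])
```

Each predicted side becomes a new cluster in the queue, as the Loop 2 header above shows.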
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (202, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)
    (361, 0.6385542168674698, 0.943876757128791, 0.3614457831325301)

Current size of match and non-match training data sets: 30 / 53

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 202 weight vectors
- Estimated match proportion 0.361

Sample size for this cluster: 62

Farthest first selection of 62 weight vectors from 202 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.530, 1.000, 0.159, 0.086, 0.182, 0.159, 0.163] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 62 weight vectors
  The oracle will correctly classify 62 weight vectors and wrongly classify 0
  Classified 43 matches and 19 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.889
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  19
    Number of false non-matches: 0

Deleted 62 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)725_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 725), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)725_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 226 true matches and 857 true non-matches
    (20.87% true matches)
  Identified 1026 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   989  (96.39%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1026 unique weight vectors)
Pureness (as a fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1026

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1026, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1026 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1026 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 938 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 177 matches and 761 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (177, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (761, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 761 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 761 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)135_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 135), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)135_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 722
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 722 weight vectors
  Containing 217 true matches and 505 true non-matches
    (30.06% true matches)
  Identified 667 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   631  (94.60%)
          2 :    33  (4.95%)
          3 :     2  (0.30%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 667 unique weight vectors)
Pureness (as a fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 484

Removed 1 non-pure weight vector

Final number of weight vectors to use: 721
  Number of unique weight vectors: 667

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (667, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 667 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 667 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 23 matches and 61 non-matches
    Purity of oracle classification:  0.726
    Entropy of oracle classification: 0.847
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
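
The purity and entropy figures reported after each oracle call follow the standard binary-classification definitions: purity is the fraction of the majority class in the classified sample, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (function names are hypothetical, not taken from the program):

```python
import math

def cluster_purity(num_match, num_non_match):
    """Fraction of the majority class among the classified vectors."""
    return max(num_match, num_non_match) / (num_match + num_non_match)

def cluster_entropy(num_match, num_non_match):
    """Binary Shannon entropy (in bits) of the match proportion."""
    p = num_match / (num_match + num_non_match)
    if p in (0.0, 1.0):
        return 0.0  # a pure cluster has zero entropy
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Oracle result above: 23 matches, 61 non-matches
print(round(cluster_purity(23, 61), 3))   # 0.726
print(round(cluster_entropy(23, 61), 3))  # 0.847
```

For the 23 matches and 61 non-matches above this reproduces the reported 0.726 and 0.847.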

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 583 weight vectors
  Based on 23 matches and 61 non-matches
  Classified 0 matches and 583 non-matches
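
At this step the oracle-labelled sample serves as training data for a binary SVM that splits the remaining, unlabelled weight vectors of the cluster into a predicted-match and a predicted-non-match sub-cluster. A sketch of the idea using scikit-learn (an assumption; the program's own SVM wrapper and parameters may differ):

```python
import numpy as np
from sklearn.svm import SVC

def split_cluster(train_vecs, train_labels, rest_vecs):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining weight vectors into two sub-clusters by its prediction."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(rest_vecs)
    return rest_vecs[pred == 1], rest_vecs[pred == 0]  # matches, non-matches

# Toy data (hypothetical): high similarities ~ matches, low ~ non-matches
rng = np.random.default_rng(0)
train = np.vstack([rng.uniform(0.7, 1.0, (20, 7)),
                   rng.uniform(0.0, 0.3, (20, 7))])
labels = np.array([1] * 20 + [0] * 20)
rest = rng.uniform(0.0, 1.0, (50, 7))
matches, non_matches = split_cluster(train, labels, rest)
assert len(matches) + len(non_matches) == 50
```

The two sub-clusters then re-enter the queue, as the next loop's output shows.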

40.0
Analysing the file: diverg(15)435_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 435), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)435_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 441
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 441 weight vectors
  Containing 196 true matches and 245 true non-matches
    (44.44% true matches)
  Identified 417 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   400  (95.92%)
          2 :    14  (3.36%)
          3 :     2  (0.48%)
          7 :     1  (0.24%)
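
The occurrence histogram above is a count of counts: first count how often each distinct weight vector occurs, then count how many vectors share each frequency. A sketch (function name hypothetical):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight
    vectors that occur exactly that often."""
    per_vector = Counter(tuple(v) for v in weight_vectors)  # vector -> frequency
    return Counter(per_vector.values())                     # frequency -> #vectors

vecs = [[1.0, 0.0], [1.0, 0.0], [0.5, 0.5], [0.2, 0.8], [0.2, 0.8], [0.2, 0.8]]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```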

Identified 0 non-pure unique weight vectors (from 417 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 243

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 441
  Number of unique weight vectors: 417

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (417, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 417 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 417 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
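
"Farthest first" selection is a greedy traversal: starting from a seed vector, repeatedly add the vector whose minimum distance to everything already selected is largest, which spreads the sample across the similarity space. A minimal sketch under the assumption of Euclidean distance and a fixed seed (both may differ in the actual program):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedily select k vectors: always add the vector whose minimum
    Euclidean distance to the already-selected set is largest."""
    selected = [start]
    # minimum distance from every vector to the selected set so far
    min_d = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        far = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(far)
        for i, v in enumerate(vectors):
            min_d[i] = min(min_d[i], math.dist(v, vectors[far]))
    return [vectors[i] for i in selected]

pts = [[0.0], [0.1], [0.5], [0.9], [1.0]]
print(farthest_first(pts, 3))  # [[0.0], [1.0], [0.5]]
```

Each newly selected vector needs only one pass to refresh the minimum distances, so the whole traversal is O(n·k).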

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 36 matches and 42 non-matches
    Purity of oracle classification:  0.538
    Entropy of oracle classification: 0.996
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  42
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 339 weight vectors
  Based on 36 matches and 42 non-matches
  Classified 134 matches and 205 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (134, 0.5384615384615384, 0.9957274520849256, 0.46153846153846156)
    (205, 0.5384615384615384, 0.9957274520849256, 0.46153846153846156)

Current size of match and non-match training data sets: 36 / 42

Selected cluster (queue ordering: random) with:
- Purity 0.54 and entropy 1.00
- Size 205 weight vectors
- Estimated match proportion 0.462

Sample size for this cluster: 65

Farthest first selection of 65 weight vectors from 205 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 65 weight vectors
  The oracle will correctly classify 65 weight vectors and wrongly classify 0
  Classified 5 matches and 60 non-matches
    Purity of oracle classification:  0.923
    Entropy of oracle classification: 0.391
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 65 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(15)314_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 314), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)314_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 823
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 823 weight vectors
  Containing 226 true matches and 597 true non-matches
    (27.46% true matches)
  Identified 766 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   729  (95.17%)
          2 :    34  (4.44%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 766 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 576

Removed 1 non-pure weight vector
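
A unique weight vector is non-pure when it was generated by both matching and non-matching record pairs; its pureness is the fraction of its occurrences that are true matches, and the minority-class copies are removed (here the single 0.950-pure vector loses its one non-match occurrence, 823 → 822). A sketch of that cleanup (names hypothetical; tie-breaking at pureness 0.5 is an assumption):

```python
from collections import defaultdict

def remove_minority_copies(vec_label_pairs):
    """For each unique weight vector, compute its pureness (fraction of
    match occurrences) and drop the minority-class copies of any vector
    that is neither fully pure (1.0) nor fully impure (0.0)."""
    groups = defaultdict(list)
    for vec, is_match in vec_label_pairs:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        if 0.0 < pureness < 1.0:
            majority = pureness > 0.5          # keep only the majority class
            labels = [l for l in labels if l == majority]
        kept.extend((list(vec), l) for l in labels)
    return kept

# 19 match + 1 non-match copies of one vector (pureness 0.95), plus a pure one
pairs = [([1.0, 0.9], True)] * 19 + [([1.0, 0.9], False)] + [([0.1, 0.2], False)]
print(len(remove_minority_copies(pairs)))  # 20
```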

Final number of weight vectors to use: 822
  Number of unique weight vectors: 766

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (766, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 766 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 766 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 681 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 153 matches and 528 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (528, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 528 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 528 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 5 matches and 67 non-matches
    Purity of oracle classification:  0.931
    Entropy of oracle classification: 0.364
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)285_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (15, 1 - acm diverg, 285), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)285_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 869
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 869 weight vectors
  Containing 190 true matches and 679 true non-matches
    (21.86% true matches)
  Identified 829 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   795  (95.90%)
          2 :    31  (3.74%)
          3 :     2  (0.24%)
          6 :     1  (0.12%)

Identified 0 non-pure unique weight vectors (from 829 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 170
     0.000 : 659

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 869
  Number of unique weight vectors: 829

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (829, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 829 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 829 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
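Farthest-first traversal greedily picks, at each step, the vector whose minimum Euclidean distance to the already-selected set is largest, so the sample spreads across the weight-vector space. A minimal sketch (seeding from the first vector is an assumption; the actual program may seed differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors (lists of floats)."""
    selected = [vectors[0]]                    # assumed seed: first vector
    # Minimum distance from each candidate to the selected set so far
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected
```

Each iteration costs one pass over all vectors, so selecting k of n vectors is O(k·n) distance evaluations.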

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 27 matches and 59 non-matches
    Purity of oracle classification:  0.686
    Entropy of oracle classification: 0.898
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
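The purity and entropy figures reported for each oracle classification follow the standard definitions for a two-class cluster: purity is the fraction belonging to the majority class, and entropy is the binary Shannon entropy of the match proportion. A sketch:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity and binary entropy of a two-class cluster, as in the log."""
    total = num_matches + num_non_matches
    p = num_matches / total                    # match proportion
    purity = max(p, 1.0 - p)                   # majority-class fraction
    if p in (0.0, 1.0):
        entropy = 0.0                          # pure cluster: zero entropy
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy
```

For the 27 matches and 59 non-matches above this gives purity ≈ 0.686 and entropy ≈ 0.898, matching the logged values.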

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 743 weight vectors
  Based on 27 matches and 59 non-matches
  Classified 126 matches and 617 non-matches
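The oracle-labelled sample is then used to split the remaining cluster with an SVM classifier. A sketch of that step using scikit-learn's `SVC` (the kernel choice and parameters here are assumptions; the original program's SVM settings are not shown in this log):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-classified vectors, then split the rest of
    the cluster into predicted matches and non-matches."""
    clf = SVC(kernel="linear")                 # assumed kernel
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    preds = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, preds) if p]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if not p]
    return matches, non_matches
```

The two resulting sub-clusters are pushed back onto the queue, which is why the queue length grows to 2 in the next loop.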

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (126, 0.686046511627907, 0.8976844934141643, 0.313953488372093)
    (617, 0.686046511627907, 0.8976844934141643, 0.313953488372093)

Current size of match and non-match training data sets: 27 / 59

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.90
- Size 126 weight vectors
- Estimated match proportion 0.314

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 126 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 49 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.141
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(15)631_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 631), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)631_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 653
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 653 weight vectors
  Containing 192 true matches and 461 true non-matches
    (29.40% true matches)
  Identified 632 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   618  (97.78%)
          2 :    11  (1.74%)
          3 :     2  (0.32%)
          7 :     1  (0.16%)

Identified 0 non-pure unique weight vectors (from 632 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 171
     0.000 : 461

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 653
  Number of unique weight vectors: 632

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (632, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 632 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 632 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 549 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 137 matches and 412 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (137, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (412, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 412 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 412 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.864, 0.667, 0.435, 0.700, 0.600] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.500, 0.826, 0.429, 0.538, 0.636] (False)
    [1.000, 0.000, 0.846, 0.857, 0.353, 0.318, 0.400] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 6 matches and 65 non-matches
    Purity of oracle classification:  0.915
    Entropy of oracle classification: 0.418
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(20)186_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 186), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)186_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 221 true matches and 872 true non-matches
    (20.22% true matches)
  Identified 1037 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1037 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority-class weight vectors with this pureness will be removed)
     0.000 : 851

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1037

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1037, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1037 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1037 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
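
The purity and entropy figures reported after each oracle step can be reproduced from the match/non-match counts alone: purity is the majority-class fraction, and entropy is the binary entropy (in bits) of the match proportion. A minimal sketch (the function name is illustrative, not from the original script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary entropy
    (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                      # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# The oracle step above classified 23 matches and 65 non-matches:
purity, entropy = purity_entropy(23, 65)
print(round(purity, 3), round(entropy, 3))   # -> 0.739 0.829
```

The same formula reproduces the (purity, entropy) pairs shown for the clusters in the queue in later loops.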

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 949 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 949 non-matches
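
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster vectors by predicted class. The log uses an SVM; purely to illustrate the split mechanics, here is a simpler nearest-centroid stand-in (all names are assumptions, not code from the original script):

```python
import math

def centroid(vecs):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(c) / len(vecs) for c in zip(*vecs)]

def split_cluster(match_sample, non_match_sample, cluster_vecs):
    """Partition the remaining vectors by whichever training-set centroid
    is closer (a nearest-centroid stand-in for the SVM in the log)."""
    cm, cn = centroid(match_sample), centroid(non_match_sample)
    matches, non_matches = [], []
    for v in cluster_vecs:
        (matches if math.dist(v, cm) <= math.dist(v, cn)
         else non_matches).append(v)
    return matches, non_matches

m, n = split_cluster([[0.9, 0.9], [0.8, 1.0]], [[0.1, 0.2], [0.2, 0.1]],
                     [[0.95, 0.85], [0.05, 0.15]])
# -> m == [[0.95, 0.85]], n == [[0.05, 0.15]]
```

Each resulting sub-cluster is then pushed back onto the queue, as the loop summaries below show.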

46.0
Analysing the file: diverg(20)953_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 953), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)953_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1044 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
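
Farthest-first selection, as used above, greedily picks each next vector as the one whose minimum distance to the already-selected set is largest, spreading the sample across the weight-vector space. A minimal sketch, assuming Euclidean distance and an arbitrary (here: first) starting vector, since the log does not show the script's exact metric or seed:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly add the vector farthest from the selected set."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the vector maximising the distance to its nearest selected vector.
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected

farthest_first([(0.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.0, 1.0)], 3)
# -> [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```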

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 156 matches and 800 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (800, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 156 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 156 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)703_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 703), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)703_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 224
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 224 weight vectors
  Containing 176 true matches and 48 true non-matches
    (78.57% true matches)
  Identified 205 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   192  (93.66%)
          2 :    10  (4.88%)
          3 :     2  (0.98%)
          6 :     1  (0.49%)

Identified 0 non-pure unique weight vectors (from 205 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 157
     0.000 : 48

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 224
  Number of unique weight vectors: 205

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (205, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 205 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 65

Perform initial selection using "far" method

Farthest first selection of 65 weight vectors from 205 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 65 weight vectors
  The oracle will correctly classify 65 weight vectors and wrongly classify 0
  Classified 36 matches and 29 non-matches
    Purity of oracle classification:  0.554
    Entropy of oracle classification: 0.992
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  29
    Number of false non-matches: 0

Deleted 65 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 140 weight vectors
  Based on 36 matches and 29 non-matches
  Classified 136 matches and 4 non-matches

  Non-match cluster not large enough for required sample size
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 1
  Number of manual oracle classifications performed: 65
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.5538461538461539, 0.9916178297881032, 0.5538461538461539)

Current size of match and non-match training data sets: 36 / 29

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 136 weight vectors
- Estimated match proportion 0.554

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 42 matches and 14 non-matches
    Purity of oracle classification:  0.750
    Entropy of oracle classification: 0.811
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  14
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing the file: diverg(10)408_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 408), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)408_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 732
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 732 weight vectors
  Containing 195 true matches and 537 true non-matches
    (26.64% true matches)
  Identified 690 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   655  (94.93%)
          2 :    32  (4.64%)
          3 :     2  (0.29%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 690 unique weight vectors)
Pureness (fraction of matches) per unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 517

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 732
  Number of unique weight vectors: 690

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (690, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 690 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 690 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
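
The purity and entropy values reported above follow directly from the oracle's match and non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch of that arithmetic (the function name `purity_entropy` is illustrative, not from the program):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is the binary
    Shannon entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total            # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                    # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# The oracle block above: 27 matches, 57 non-matches
p, e = purity_entropy(27, 57)
print(round(p, 3), round(e, 3))        # 0.679 0.906
```

The match proportion 27/84 ≈ 0.321 is also what appears as the estimated match proportion of the two child clusters in the next loop.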

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 606 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 143 matches and 463 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (463, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 463 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 463 vectors
  The selected farthest weight vectors are:
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.462, 0.609, 0.684, 0.308, 0.545] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.348, 0.429, 0.526, 0.529, 0.619] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
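
The farthest-first traversal behind these selections can be sketched greedily: start from one vector and repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch assuming Euclidean distance (the program's actual distance measure and starting rule are not shown in this output):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first selection of k vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[start]]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected

vecs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
print(farthest_first(vecs, 3))  # [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```

This spreads the sampled vectors over the cluster, which is why the selected list above mixes high- and low-similarity vectors rather than near-duplicates.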

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 2 matches and 69 non-matches
    Purity of oracle classification:  0.972
    Entropy of oracle classification: 0.185
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analysing file: diverg(15)873_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 873), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)873_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1035
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1035 weight vectors
  Containing 223 true matches and 812 true non-matches
    (21.55% true matches)
  Identified 981 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   944  (96.23%)
          2 :    34  (3.47%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 981 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 791

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1034
  Number of unique weight vectors: 981
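
The pureness analysis above groups identical weight vectors, computes the fraction of true matches among each unique vector's copies, and removes the minority-class copies of any non-pure unique vector; the 0.941 entry is consistent with 16 matches among the one vector that occurs 17 times (16/17 ≈ 0.941). A hedged sketch of that clean-up (`remove_non_pure` is an illustrative name):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """Drop minority-class copies of unique weight vectors that carry
    both match and non-match labels (non-pure vectors)."""
    groups = defaultdict(list)          # unique vector -> list of labels
    for vec, is_match in weight_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        if 0.0 < pureness < 1.0:        # non-pure: keep majority class only
            majority = pureness >= 0.5
            kept.extend((list(vec), majority) for lab in labels if lab == majority)
        else:                           # pure: keep every copy
            kept.extend((list(vec), lab) for lab in labels)
    return kept

# Mirrors the 0.941 case above: 16 matches + 1 non-match of one vector
data = ([([0.9, 0.8], True)] * 16 + [([0.9, 0.8], False)]
        + [([0.1, 0.2], False)] * 3)
print(len(remove_non_pure(data)))  # 19 (the single minority copy is removed)
```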

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (981, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 981 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 981 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 894 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 156 matches and 738 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (738, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 156 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 156 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)64_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 64), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)64_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
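
The purity and entropy reported for each oracle step follow the usual two-class definitions: purity is the majority-class fraction of the sample, entropy is the binary Shannon entropy of the match proportion. A minimal sketch (the function name `purity_entropy` is illustrative, not from the program):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity (majority-class fraction) and binary Shannon entropy of a
    sample of oracle-labelled weight vectors."""
    total = num_match + num_non_match
    p = num_match / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The 14 matches / 54 non-matches classified above:
print('%.3f %.3f' % purity_entropy(14, 54))  # 0.794 0.734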

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)615_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 615), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)615_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 725
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 725 weight vectors
  Containing 209 true matches and 516 true non-matches
    (28.83% true matches)
  Identified 691 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   674  (97.54%)
          2 :    14  (2.03%)
          3 :     2  (0.29%)
         17 :     1  (0.14%)
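
The occurrence distribution above (how many distinct weight vectors occur once, twice, and so on) amounts to two nested counts. A sketch, assuming each weight vector is a sequence of similarity values (`occurrence_distribution` is an illustrative name):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then count how
    many distinct vectors share each occurrence frequency."""
    per_vector = Counter(tuple(wv) for wv in weight_vectors)
    return Counter(per_vector.values())

vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.9), (0.9, 0.9), (0.9, 0.9)]
dist = occurrence_distribution(vecs)
for occ in sorted(dist):
    print('%5d : %5d' % (occ, dist[occ]))
```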

Identified 1 non-pure unique weight vector (from 691 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 177
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 513

Removed 1 non-pure weight vector

Final number of weight vectors to use: 724
  Number of unique weight vectors: 691
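
The pureness filter above removes only the minority-class copies of weight vectors that were generated by both matching and non-matching record pairs (pureness strictly between 0 and 1); here the vector occurring 17 times with pureness 0.941 (16 of 17 copies are matches) loses its single non-match copy. A sketch of that filter, with labels 1 = match and 0 = non-match (`remove_non_pure` is an illustrative name):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels):
    """Drop minority-class copies of weight vectors whose pureness
    (match fraction among identical vectors) is strictly between 0 and 1."""
    groups = defaultdict(list)
    for wv, lab in zip(weight_vectors, labels):
        groups[tuple(wv)].append(lab)
    kept = []
    for wv, lab in zip(weight_vectors, labels):
        labs = groups[tuple(wv)]
        pureness = sum(labs) / len(labs)
        if 0.0 < pureness < 1.0:
            majority = 1 if pureness >= 0.5 else 0  # ties kept as matches
            if lab != majority:
                continue  # minority-class copy: remove it
        kept.append((wv, lab))
    return kept
```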

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (691, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 691 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 691 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
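
The "far" initial selection above is a farthest-first traversal: after a start vector, each subsequent pick maximises its minimum distance to the vectors already selected, spreading the sample over the corners of the similarity space. A minimal Euclidean-distance sketch (`farthest_first` is an illustrative name; the program's distance function and start rule may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: after an arbitrary start vector,
    repeatedly pick the vector whose minimum Euclidean distance to the
    already-selected vectors is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # arbitrary start; the program may choose differently
    min_dist = [dist(vectors[0], v) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(vectors[i], v))
    return selected

# Corner-like points of the unit square are picked before the centre point:
print(farthest_first([(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)], 3))
```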

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 607 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 142 matches and 465 non-matches
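
After the oracle-labelled sample is removed, the remaining vectors are classified by a model trained on that sample (here 31 matches and 53 non-matches), producing the two child clusters of sizes 142 and 465 seen in Loop 2. A sketch of that splitting step, using a tiny perceptron as a stand-in for the program's SVM (all names illustrative):

```python
def train_perceptron(X, y, epochs=100, lr=0.1):
    """Tiny linear classifier (perceptron), standing in for the SVM that the
    program trains on the oracle-labelled sample (labels 1/0 = match/non-match)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            if t != pred:
                err = t - pred
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
    return w, b

def split_cluster(train_X, train_y, cluster):
    """Classify the remaining cluster vectors and split the cluster into a
    predicted-match and a predicted-non-match child cluster."""
    w, b = train_perceptron(train_X, train_y)
    match, non_match = [], []
    for x in cluster:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        (match if score > 0 else non_match).append(x)
    return match, non_match
```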

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (465, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 142 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 51 matches and 4 non-matches
    Purity of oracle classification:  0.927
    Entropy of oracle classification: 0.376
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(20)699_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 699), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)699_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 801
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 801 weight vectors
  Containing 220 true matches and 581 true non-matches
    (27.47% true matches)
  Identified 763 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   745  (97.64%)
          2 :    15  (1.97%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 763 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 578

Removed 1 non-pure weight vector

Final number of weight vectors to use: 800
  Number of unique weight vectors: 763

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (763, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 763 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 763 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 678 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 135 matches and 543 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 135 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 135 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)141_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (10, 1 - acm diverg, 141), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)141_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 569
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 569 weight vectors
  Containing 198 true matches and 371 true non-matches
    (34.80% true matches)
  Identified 537 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   523  (97.39%)
          2 :    11  (2.05%)
          3 :     2  (0.37%)
         18 :     1  (0.19%)

Identified 1 non-pure unique weight vector (from 537 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 166
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 370

Removed 1 non-pure weight vector

Final number of weight vectors to use: 568
  Number of unique weight vectors: 537

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (537, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 537 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 537 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
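The farthest-first traversal used for this initial selection can be sketched as follows. This is a minimal version assuming Euclidean distance between weight vectors and an arbitrary first seed; the function name and seeding strategy are illustrative, not necessarily what the program uses.

```python
import math

def farthest_first(vectors, k):
    """Greedily pick k vectors so that each new pick maximises its
    distance to the closest already-selected vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed with an arbitrary vector
    while len(selected) < k:
        # A candidate's score is its distance to the nearest selected
        # vector; pick the candidate that maximises this score.
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

Because each pick maximises the minimum distance to the vectors chosen so far, the sample spreads across the whole cluster rather than concentrating in one region, which is why the selected vectors above mix clear matches and clear non-matches.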

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 29 matches and 52 non-matches
    Purity of oracle classification:  0.642
    Entropy of oracle classification: 0.941
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0
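The purity and entropy figures reported throughout this log follow the usual definitions for a two-class sample: purity is the fraction of the majority class, and entropy is the binary class entropy in bits (0 for a pure cluster, 1 for a 50/50 split). A minimal sketch reproducing the numbers above (29 matches, 52 non-matches):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity (majority-class fraction) and binary entropy in bits
    of a sample with the given match / non-match counts."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:               # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the oracle sample above, `purity_entropy(29, 52)` gives purity 52/81 = 0.642 and entropy 0.941, matching the reported values; the estimated match proportion is simply 29/81 = 0.358.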

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 456 weight vectors
  Based on 29 matches and 52 non-matches
  Classified 142 matches and 314 non-matches
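The oracle-labelled vectors serve as training data for a classifier that splits the rest of the cluster into a predicted-match and a predicted-non-match sub-cluster, which are then pushed back onto the queue. A sketch with scikit-learn (the SVC defaults here, such as the RBF kernel, are an assumption; the program's actual SVM settings may differ):

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled vectors, then split the
    remaining cluster into predicted matches and non-matches."""
    clf = svm.SVC()  # default RBF kernel; actual settings may differ
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches     = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```

In the step above, the 29 matches and 52 non-matches are the training data, and the 456 unlabelled vectors are split into the two sub-clusters of sizes 142 and 314 shown in the next loop's queue.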

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6419753086419753, 0.9410313090323237, 0.35802469135802467)
    (314, 0.6419753086419753, 0.9410313090323237, 0.35802469135802467)

Current size of match and non-match training data sets: 29 / 52

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 142 weight vectors
- Estimated match proportion 0.358

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 51 matches and 4 non-matches
    Purity of oracle classification:  0.927
    Entropy of oracle classification: 0.376
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0
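The oracle simulated in these runs keeps each true label with probability equal to its accuracy and flips it otherwise; at 100% accuracy every label is returned correctly, as in the output above. A minimal sketch (the function name and the fixed seed are illustrative assumptions):

```python
import random

def simulated_oracle(true_labels, accuracy, rng=None):
    """Return labels where each true label is kept with probability
    `accuracy` and flipped otherwise, simulating a fallible human."""
    rng = rng or random.Random(42)  # fixed seed only for reproducibility
    labelled = []
    for label in true_labels:
        if rng.random() < accuracy:
            labelled.append(label)       # correct classification
        else:
            labelled.append(not label)   # oracle error
    return labelled
```

With `accuracy=1.0` the output equals the true labels, so the "wrongly classify 0" and zero false match / false non-match counts above follow directly.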

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing file: diverg(15)496_NEW.csv
<class 'pandas.core.series.Series'>
Current line right here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 496), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)496_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1024
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1024 weight vectors
  Containing 198 true matches and 826 true non-matches
    (19.34% true matches)
  Identified 982 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   947  (96.44%)
          2 :    32  (3.26%)
          3 :     2  (0.20%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 982 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 806
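The pureness of a unique weight vector is the proportion of its occurrences generated by true matching record pairs; vectors with pureness strictly between 0 and 1 are "non-pure" (the same feature values arise from both matches and non-matches). A sketch, assuming the input comes as (vector tuple, is_match) pairs:

```python
from collections import defaultdict

def pureness_distribution(weight_vectors):
    """Map each unique weight vector to the fraction of its
    occurrences generated by true matching record pairs."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [matches, total]
    for vec, is_match in weight_vectors:
        counts[vec][1] += 1
        if is_match:
            counts[vec][0] += 1
    return {vec: m / t for vec, (m, t) in counts.items()}
```

For this data set every unique vector has pureness 1.000 (always a match) or 0.000 (always a non-match), so nothing needs to be removed; the run further below, with one vector at pureness 0.917, shows the non-pure case.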

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1024
  Number of unique weight vectors: 982

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (982, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 982 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 982 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 895 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 93 matches and 802 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (93, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (802, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 802 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 802 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(10)493_NEW.csv
<class 'pandas.core.series.Series'>
Current line right here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 493), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)493_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 417
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 417 weight vectors
  Containing 200 true matches and 217 true non-matches
    (47.96% true matches)
  Identified 391 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   377  (96.42%)
          2 :    11  (2.81%)
          3 :     2  (0.51%)
         12 :     1  (0.26%)

Identified 1 non-pure unique weight vector (from 391 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 216

Removed 1 non-pure weight vector

Final number of weight vectors to use: 416
  Number of unique weight vectors: 391

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (391, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 391 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 391 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 39 matches and 38 non-matches
    Purity of oracle classification:  0.506
    Entropy of oracle classification: 1.000
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 314 weight vectors
  Based on 39 matches and 38 non-matches
  Classified 133 matches and 181 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.5064935064935064, 0.9998783322990061, 0.5064935064935064)
    (181, 0.5064935064935064, 0.9998783322990061, 0.5064935064935064)

Current size of match and non-match training data sets: 39 / 38

Selected cluster with (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 181 weight vectors
- Estimated match proportion 0.506

Sample size for this cluster: 63

Farthest first selection of 63 weight vectors from 181 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.881, 1.000, 0.211, 0.250, 0.129, 0.250, 0.211] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.780, 1.000, 0.271, 0.152, 0.137, 0.250, 0.167] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.715, 1.000, 0.214, 0.125, 0.270, 0.214, 0.167] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.261, 0.174, 0.148, 0.186, 0.148] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.740, 1.000, 0.261, 0.273, 0.186, 0.171, 0.095] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.663, 1.000, 0.273, 0.244, 0.226, 0.196, 0.238] (False)
    [0.800, 1.000, 0.242, 0.121, 0.200, 0.171, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.442, 1.000, 0.235, 0.184, 0.120, 0.167, 0.185] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.931, 1.000, 0.250, 0.118, 0.200, 0.190, 0.308] (False)
    [0.850, 1.000, 0.179, 0.205, 0.188, 0.061, 0.180] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
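
Farthest-first selection greedily picks, at each step, the weight vector whose minimum distance to the already-selected set is largest, so the sample spreads out over the whole cluster. A minimal sketch (the seed choice and the Euclidean metric are assumptions; the log does not specify them):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector
    (seed choice is an assumption), then repeatedly add the candidate
    maximising its minimum squared Euclidean distance to the
    already-selected set."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]
    candidates = list(vectors[1:])
    while candidates and len(selected) < k:
        best = max(candidates, key=lambda v: min(dist2(v, s) for s in selected))
        selected.append(best)
        candidates.remove(best)
    return selected

# The two mutually distant points are picked before the nearby ones
sample = farthest_first([(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.5, 0.5)], 3)
```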

Perform oracle with 100.00% accuracy on 63 weight vectors
  The oracle will correctly classify 63 weight vectors and wrongly classify 0
  Classified 8 matches and 55 non-matches
    Purity of oracle classification:  0.873
    Entropy of oracle classification: 0.549
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
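
The oracle accuracy in these headers corresponds to the script's oracle_acc parameter: a simulated human oracle that returns the true match status, flipping it with some probability. A hedged sketch (the flip-with-probability-1-minus-accuracy mechanism is an assumption consistent with the "correctly/wrongly classify" counts above):

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    """Simulate a human oracle: each true label is flipped with
    probability 1 - accuracy. With accuracy 1.0, as in this run,
    every label is returned unchanged."""
    rng = rng or random.Random(0)  # arbitrary seed, for reproducibility
    return [lab if rng.random() < accuracy else (not lab)
            for lab in true_labels]
```

With accuracy 1.0 the number of false matches and false non-matches is always zero, exactly as reported throughout this run.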

Deleted 63 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(20)687_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 687), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)687_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
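
The frequency distribution above is a count-of-counts: first tally how often each unique weight vector occurs, then tally how many unique vectors share each occurrence count. A sketch using collections.Counter (the function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight
    vectors that occur that often (vectors must be hashable,
    e.g. tuples)."""
    per_vector = Counter(weight_vectors)   # vector -> how often it occurs
    return Counter(per_vector.values())    # occurrences -> number of vectors

vecs = [(1.0, 0.5)] * 3 + [(0.2, 0.2)] * 2 + [(0.9, 0.1)]
print(dict(occurrence_distribution(vecs)))  # {3: 1, 2: 1, 1: 1}
```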

Identified 1 non-pure unique weight vectors (from 1038 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)611_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 611), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)611_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 441
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 441 weight vectors
  Containing 196 true matches and 245 true non-matches
    (44.44% true matches)
  Identified 417 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   400  (95.92%)
          2 :    14  (3.36%)
          3 :     2  (0.48%)
          7 :     1  (0.24%)

Identified 0 non-pure unique weight vectors (from 417 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 243

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 441
  Number of unique weight vectors: 417

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (417, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 417 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 417 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 36 matches and 42 non-matches
    Purity of oracle classification:  0.538
    Entropy of oracle classification: 0.996
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  42
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 339 weight vectors
  Based on 36 matches and 42 non-matches
  Classified 134 matches and 205 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (134, 0.5384615384615384, 0.9957274520849256, 0.46153846153846156)
    (205, 0.5384615384615384, 0.9957274520849256, 0.46153846153846156)

Current size of match and non-match training data sets: 36 / 42

Selected cluster with (queue ordering: random):
- Purity 0.54 and entropy 1.00
- Size 134 weight vectors
- Estimated match proportion 0.462

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 134 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 51 matches and 5 non-matches
    Purity of oracle classification:  0.911
    Entropy of oracle classification: 0.434
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(10)676_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 676), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)676_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 666
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 666 weight vectors
  Containing 212 true matches and 454 true non-matches
    (31.83% true matches)
  Identified 614 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   578  (94.14%)
          2 :    33  (5.37%)
          3 :     2  (0.33%)
         16 :     1  (0.16%)

Identified 1 non-pure unique weight vectors (from 614 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 433

Removed 1 non-pure weight vector

Final number of weight vectors to use: 665
  Number of unique weight vectors: 614

Time to load and analyse the weight vector file: 0.01 sec
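The unique-vector analysis above (the occurrence distribution of identical weight vectors, and the "pureness", i.e. the fraction of true matches, of each unique vector) can be sketched as follows. This is an illustrative reconstruction, not the script's own code; all names are hypothetical:

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(vectors, labels):
    """Group identical weight vectors, count how often each occurs, and
    compute the pureness (fraction of true matches) per unique vector.
    Minority-class copies of non-pure vectors can then be removed."""
    groups = defaultdict(list)          # unique vector -> list of match labels
    for vec, is_match in zip(vectors, labels):
        groups[tuple(vec)].append(is_match)
    # occurrence frequency distribution: count -> number of unique vectors
    occ_dist = Counter(len(labs) for labs in groups.values())
    pureness = {vec: sum(labs) / len(labs) for vec, labs in groups.items()}
    return occ_dist, pureness
```

For example, 15 matches among 16 identical vectors gives 15/16 = 0.938, the pureness seen above; the single minority-class copy is presumably the one weight vector removed.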

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (614, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 614 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 614 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
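The "far" initial selection above is a farthest-first traversal: each new sample is the weight vector whose minimum Euclidean distance to the already-selected set is largest. A minimal stdlib sketch (a hypothetical helper; the script's seeding strategy and distance metric may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    minimum Euclidean distance to the selected set is largest.
    `vectors` is a list of equal-length numeric tuples, `k` the sample size."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]             # seed with an arbitrary first vector
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

This greedy rule explains why the listed vectors spread over the corners of the weight space rather than clustering.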

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
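The purity and entropy reported after each oracle round follow the standard definitions: purity is the majority-class fraction of the classified sample, and entropy is the base-2 Shannon entropy of the match/non-match split. A sketch that reproduces the figures above:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = fraction of the majority class; entropy = base-2 Shannon
    entropy of the binary match/non-match distribution."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                     # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For instance, `purity_entropy(27, 56)` gives purity 0.675 and entropy 0.910 after rounding, matching the Loop 1 oracle statistics.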

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 531 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 151 matches and 380 non-matches
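The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by predicted class, pushing both sub-clusters back onto the queue. As a self-contained stand-in for the SVM used by the script (which likely relies on an external SVM library with different parameters), here is a minimal Pegasos-style linear SVM:

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=300):
    """Pegasos-style linear SVM: hinge loss, stochastic sub-gradient descent.
    Labels must be +1 (match) or -1 (non-match). No bias term: append a
    constant 1.0 feature to each vector if one is needed."""
    w = [0.0] * len(X[0])
    t = 0
    rng = random.Random(42)              # fixed seed for reproducibility
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)        # Pegasos learning-rate schedule
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]   # regularisation shrink
            if margin < 1.0:             # hinge active: step towards x_i
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def svm_split(w, vectors):
    """Partition a cluster into predicted matches / non-matches by the
    sign of the decision function, as in the split step logged above."""
    matches = [v for v in vectors
               if sum(wj * xj for wj, xj in zip(w, v)) >= 0.0]
    non_matches = [v for v in vectors
                   if sum(wj * xj for wj, xj in zip(w, v)) < 0.0]
    return matches, non_matches
```

The two resulting sub-clusters inherit the purity, entropy, and estimated match proportion of the oracle-classified sample, as seen in the queue printout of the next loop.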

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (380, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 151 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 50 matches and 4 non-matches
    Purity of oracle classification:  0.926
    Entropy of oracle classification: 0.381
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(20)560_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 560), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)560_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 789
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 789 weight vectors
  Containing 225 true matches and 564 true non-matches
    (28.52% true matches)
  Identified 750 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   731  (97.47%)
          2 :    16  (2.13%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 750 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 561

Removed 1 non-pure weight vector

Final number of weight vectors to use: 788
  Number of unique weight vectors: 750

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (750, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 750 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 750 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 34 matches and 51 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 665 weight vectors
  Based on 34 matches and 51 non-matches
  Classified 153 matches and 512 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6, 0.9709505944546686, 0.4)
    (512, 0.6, 0.9709505944546686, 0.4)

Current size of match and non-match training data sets: 34 / 51

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 153 weight vectors
- Estimated match proportion 0.400

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 50 matches and 8 non-matches
    Purity of oracle classification:  0.862
    Entropy of oracle classification: 0.579
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)266_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.197324
f-measure              0.329609
da                           59
dm                            0
ndm                           0
tp                           59
fp                            0
tn                  4.76529e+07
fn                          240
Name: (10, 1 - acm diverg, 266), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)266_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 364
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 364 weight vectors
  Containing 193 true matches and 171 true non-matches
    (53.02% true matches)
  Identified 338 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   323  (95.56%)
          2 :    12  (3.55%)
          3 :     2  (0.59%)
         11 :     1  (0.30%)

Identified 1 non-pure unique weight vector (from 338 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 169
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 168

Removed 1 non-pure weight vector

Final number of weight vectors to use: 363
  Number of unique weight vectors: 338

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (338, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 338 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 338 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 31 matches and 44 non-matches
    Purity of oracle classification:  0.587
    Entropy of oracle classification: 0.978
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 263 weight vectors
  Based on 31 matches and 44 non-matches
  Classified 140 matches and 123 non-matches

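The SVM step above trains on the oracle-labelled sample and partitions the remaining weight vectors of the cluster by predicted class. A sketch using scikit-learn's `SVC` (the program's actual kernel and parameters are not visible in this log, so the linear kernel here is an assumption):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split_cluster(match_train, non_match_train, cluster_vectors):
    """Train an SVM on oracle-labelled vectors, then split the remaining
    cluster into predicted-match and predicted-non-match sub-clusters."""
    X = np.vstack([match_train, non_match_train])
    y = np.r_[np.ones(len(match_train)), np.zeros(len(non_match_train))]
    clf = SVC(kernel='linear')  # kernel choice is an assumption
    clf.fit(X, y)
    cluster_vectors = np.asarray(cluster_vectors)
    pred = clf.predict(cluster_vectors)
    return cluster_vectors[pred == 1], cluster_vectors[pred == 0]
```

Both sub-clusters then go back on the processing queue, as the Loop 2 output shows.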
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.5866666666666667, 0.9782176659354248, 0.41333333333333333)
    (123, 0.5866666666666667, 0.9782176659354248, 0.41333333333333333)

Current size of match and non-match training data sets: 31 / 44

Selected cluster (queue ordering: random) with:
- Purity 0.59 and entropy 0.98
- Size 123 weight vectors
- Estimated match proportion 0.413

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.800, 0.636, 0.563, 0.545, 0.722] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)

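Farthest-first selection can be sketched as the classic greedy traversal: pick a start vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. The program's start-vector choice and distance metric are not shown in this log, so the seeded random start and Euclidean distance below are assumptions:

```python
import numpy as np

def farthest_first_selection(vectors, k, seed=0):
    """Greedily select k vector indices so that each new pick maximises
    the minimum distance to all previously selected vectors."""
    vectors = np.asarray(vectors, dtype=float)
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(vectors)))]
    # Minimum distance from every vector to the selected set so far
    min_dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(
            min_dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```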
Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 4 matches and 49 non-matches
    Purity of oracle classification:  0.925
    Entropy of oracle classification: 0.386
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

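The oracle step simulates manual classification at a configurable accuracy: each true match status is kept with probability equal to the accuracy and flipped otherwise, so at 100% accuracy (as in these runs) no labels change. A sketch with our own parameter names:

```python
import random

def oracle_classify(true_statuses, accuracy, seed=0):
    """Simulate a human oracle that labels each weight vector correctly
    with the given probability (accuracy 1.0 = perfect oracle)."""
    rng = random.Random(seed)
    return [is_match if rng.random() < accuracy else not is_match
            for is_match in true_statuses]
```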
Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

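Taken together, the log suggests a control loop of this shape: keep a queue of clusters, pick one at random, have the oracle label a sample of it, remove the labelled vectors into the training sets, and split clusters that are still impure until the manual classification budget is spent. The sketch below is a reconstruction under those assumptions; `sample_fn`, `oracle_fn`, and `split_fn` are placeholders for the program's actual choices (e.g. farthest-first sampling and an SVM split):

```python
import random

def recursive_selection_loop(initial_cluster, budget, sample_fn, oracle_fn,
                             split_fn, min_purity=0.95):
    """Control loop suggested by the log output above (a sketch)."""
    queue = [list(initial_cluster)]
    train_match, train_non_match = [], []
    used = 0                                   # manual classifications so far
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random ordering
        sample = sample_fn(cluster)            # e.g. farthest-first selection
        if not sample:
            break                              # nothing left to label
        labels = [oracle_fn(vec) for vec in sample]
        used += len(sample)
        for vec, is_match in zip(sample, labels):
            cluster.remove(vec)                # delete classified vectors
            (train_match if is_match else train_non_match).append(vec)
        n_match = sum(labels)
        purity = max(n_match, len(labels) - n_match) / len(labels)
        if cluster and purity < min_purity:
            # Split the remainder with a classifier trained on the labelled
            # data (an SVM in this log) and put both halves back on the queue.
            queue.extend(split_fn(train_match, train_non_match, cluster))
    return train_match, train_non_match
```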
59.0
Analysing file: diverg(10)300_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984615
recall                 0.214047
f-measure              0.351648
da                           65
dm                            0
ndm                           0
tp                           64
fp                            1
tn                  4.76529e+07
fn                          235
Name: (10, 1 - acm diverg, 300), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)300_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 591
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 591 weight vectors
  Containing 190 true matches and 401 true non-matches
    (32.15% true matches)
  Identified 544 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :   511  (93.93%)
          2 :    30  (5.51%)
          3 :     2  (0.37%)
         14 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 544 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 163
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 380

Removed 1 non-pure weight vector

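The occurrence and pureness analysis above can be reproduced by grouping identical weight vectors: pureness is the fraction of copies of a unique vector that come from true matches, and minority-class copies of non-pure vectors are then dropped. A sketch (the `(weights, is_match)` pair representation is an assumption):

```python
from collections import Counter, defaultdict

def analyse_vectors(weight_vectors):
    """weight_vectors: list of (weights_tuple, is_true_match) pairs.
    Returns the occurrence frequency distribution and per-vector pureness."""
    occurrences = Counter(w for w, _ in weight_vectors)
    match_counts = defaultdict(int)
    for w, is_match in weight_vectors:
        if is_match:
            match_counts[w] += 1
    # occurrence count -> number of unique vectors occurring that often
    freq_dist = Counter(occurrences.values())
    # fraction of true-match copies per unique vector
    pureness = {w: match_counts[w] / c for w, c in occurrences.items()}
    return freq_dist, pureness
```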
Final number of weight vectors to use: 590
  Number of unique weight vectors: 544

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (544, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 544 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 544 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 31 matches and 50 non-matches
    Purity of oracle classification:  0.617
    Entropy of oracle classification: 0.960
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 463 weight vectors
  Based on 31 matches and 50 non-matches
  Classified 155 matches and 308 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6172839506172839, 0.9599377175669783, 0.38271604938271603)
    (308, 0.6172839506172839, 0.9599377175669783, 0.38271604938271603)

Current size of match and non-match training data sets: 31 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 155 weight vectors
- Estimated match proportion 0.383

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 155 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 46 matches and 11 non-matches
    Purity of oracle classification:  0.807
    Entropy of oracle classification: 0.708
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

65.0
Analysing file: diverg(20)411_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 411), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)411_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 732
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 732 weight vectors
  Containing 219 true matches and 513 true non-matches
    (29.92% true matches)
  Identified 677 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :   641  (94.68%)
          2 :    33  (4.87%)
          3 :     2  (0.30%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 677 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 492

Removed 1 non-pure weight vector

Final number of weight vectors to use: 731
  Number of unique weight vectors: 677

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (677, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 677 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 677 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 593 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 148 matches and 445 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (445, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 148 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 51 matches and 3 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.310
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)359_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (20, 1 - acm diverg, 359), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)359_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 201 true matches and 752 true non-matches
    (21.09% true matches)
  Identified 908 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :   874  (96.26%)
          2 :    31  (3.41%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 908 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 952
  Number of unique weight vectors: 908

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (908, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 908 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 908 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
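
The farthest-first selection listed above can be sketched as a greedy loop: seed with one vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. This is a sketch only; seeding with the first vector and using Euclidean distance are assumptions, and the actual implementation may differ:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors (assumed variant:
    seeded with the first vector, Euclidean distance)."""
    selected = [vectors[0]]
    while len(selected) < k:
        # Pick the remaining vector farthest from its nearest selected one
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
    return selected

pts = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (1.0, 0.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```

The greedy rule favours vectors in sparse, unexplored regions of the similarity space, which is why the selected sample mixes clear matches and clear non-matches.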

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
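
The purity and entropy figures reported for each oracle classification follow the standard two-class definitions: purity is the majority-class fraction, entropy the binary Shannon entropy of the match proportion. A sketch that reproduces the numbers above:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity (majority-class fraction) and binary Shannon entropy of a
    two-class split, given match / non-match counts."""
    n = num_match + num_non_match
    p = num_match / n
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:           # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# Reproduces the figures for 24 matches / 63 non-matches above
pur, ent = purity_entropy(24, 63)
print(round(pur, 3), round(ent, 3))  # 0.724 0.85
```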

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 821 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 115 matches and 706 non-matches
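
The split step trains a classifier on the oracle-labelled sample and uses it to divide the remaining weight vectors into two child clusters, which then go back onto the queue. A minimal sketch with scikit-learn's `SVC` standing in for the SVM (which library the program actually uses is an assumption here, and the 2-D vectors are made-up):

```python
from sklearn.svm import SVC

# Made-up oracle-labelled sample (1 = match, 0 = non-match)
train_X = [[0.9, 0.8], [0.95, 1.0], [0.1, 0.2], [0.2, 0.1]]
train_y = [1, 1, 0, 0]
rest = [[0.85, 0.9], [0.15, 0.3], [0.05, 0.1]]  # still unlabelled

clf = SVC(kernel="linear").fit(train_X, train_y)
pred = clf.predict(rest)

# The two child clusters produced by the split
matches = [v for v, p in zip(rest, pred) if p == 1]
non_matches = [v for v, p in zip(rest, pred) if p == 0]
print(len(matches), len(non_matches))
```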

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (115, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 115 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 46

Farthest first selection of 46 weight vectors from 115 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)

Perform oracle with 100.00% accuracy on 46 weight vectors
  The oracle will correctly classify 46 weight vectors and wrongly classify 0
  Classified 46 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 46 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)882_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 882), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)882_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 803
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 803 weight vectors
  Containing 208 true matches and 595 true non-matches
    (25.90% true matches)
  Identified 756 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   721  (95.37%)
          2 :    32  (4.23%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 756 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 574

Removed 1 non-pure weight vector

Final number of weight vectors to use: 802
  Number of unique weight vectors: 756

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (756, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 756 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 756 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 26 matches and 59 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.888
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 671 weight vectors
  Based on 26 matches and 59 non-matches
  Classified 138 matches and 533 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)
    (533, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)

Current size of match and non-match training data sets: 26 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 138 weight vectors
- Estimated match proportion 0.306

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 138 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 49 matches and 2 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.239
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)195_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 195), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)195_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 736
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 736 weight vectors
  Containing 221 true matches and 515 true non-matches
    (30.03% true matches)
  Identified 700 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   684  (97.71%)
          2 :    13  (1.86%)
          3 :     2  (0.29%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 700 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 514

Removed 1 non-pure weight vector

Final number of weight vectors to use: 735
  Number of unique weight vectors: 700

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (700, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 700 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 700 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 616 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 145 matches and 471 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (471, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 145 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

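Farthest-first selection, used above to pick a diverse sample from a cluster, greedily adds the weight vector whose minimum Euclidean distance to the already selected vectors is largest. A minimal brute-force sketch (the function name and the choice of starting index are illustrative):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: starting from index `start`,
    repeatedly add the vector whose minimum distance to the selected
    set is largest. Returns the indices of the k selected vectors."""
    selected = [start]
    while len(selected) < min(k, len(vectors)):
        best_i, best_d = None, -1.0
        for i, v in enumerate(vectors):
            if i in selected:
                continue
            d = min(math.dist(v, vectors[j]) for j in selected)
            if d > best_d:
                best_i, best_d = i, d
        selected.append(best_i)
    return selected
```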
Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

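The oracle calls above can be simulated in a few lines. A sketch assuming each queried label is flipped independently with probability 1 - accuracy (the helper name is hypothetical; at 100% accuracy, as in this run, no label is ever flipped):

```python
import random

def simulated_oracle(true_labels, accuracy=1.0, seed=42):
    """Report each true match label correctly with probability
    `accuracy`, flipped otherwise."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```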
Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)931_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 931), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)931_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

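The load-and-analyse step above groups identical weight vectors, reports how often each occurs, and removes the minority-class copies of any non-pure group (a group whose copies carry both match and non-match labels). A sketch of that step (names are illustrative):

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(vectors, min_pureness=1.0):
    """`vectors` is a list of (weight_vector, is_match) pairs. Returns
    the occurrence frequency distribution and the cleaned list in which
    minority-class copies of non-pure groups have been removed."""
    groups = defaultdict(list)
    for vec, is_match in vectors:
        groups[tuple(vec)].append(is_match)
    # Occurrence : number of unique weight vectors occurring that often
    freq = Counter(len(labels) for labels in groups.values())
    kept = []
    for vec, labels in groups.items():
        majority = max(set(labels), key=labels.count)
        pureness = labels.count(majority) / len(labels)
        for lab in labels:
            if pureness >= min_pureness or lab == majority:
                kept.append((list(vec), lab))
    return freq, kept
```

With `min_pureness = 1.0` this matches the log above: the single 0.950-pure group loses its one minority-class copy, shrinking 862 weight vectors to 861.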
Time to load and analyse the weight vector file: 0.04 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 566 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)299_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 299), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)299_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 627
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 627 weight vectors
  Containing 196 true matches and 431 true non-matches
    (31.26% true matches)
  Identified 578 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   544  (94.12%)
          2 :    31  (5.36%)
          3 :     2  (0.35%)
         15 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 578 unique weight vectors)
Pureness (percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 167
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 410

Removed 1 non-pure weight vector

Final number of weight vectors to use: 626
  Number of unique weight vectors: 578

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (578, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 578 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 578 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 30 matches and 52 non-matches
    Purity of oracle classification:  0.634
    Entropy of oracle classification: 0.947
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 496 weight vectors
  Based on 30 matches and 52 non-matches
  Classified 158 matches and 338 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (158, 0.6341463414634146, 0.9474351361840306, 0.36585365853658536)
    (338, 0.6341463414634146, 0.9474351361840306, 0.36585365853658536)

Current size of match and non-match training data sets: 30 / 52

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 158 weight vectors
- Estimated match proportion 0.366

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 158 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 47 matches and 10 non-matches
    Purity of oracle classification:  0.825
    Entropy of oracle classification: 0.670
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  10
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing the file: diverg(20)296_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 296), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)296_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1059
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1059 weight vectors
  Containing 227 true matches and 832 true non-matches
    (21.44% true matches)
  Identified 1002 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   965  (96.31%)
          2 :    34  (3.39%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1002 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 811

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1058
  Number of unique weight vectors: 1002

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1002, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1002 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1002 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
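
The "far" initial selection above appears to be a greedy farthest-first traversal over the weight vectors: seed with one vector, then repeatedly pick the vector whose distance to its nearest already-selected vector is largest. A minimal sketch of that idea (the function name, the fixed start index, and the Euclidean metric are assumptions, not the original implementation):

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    # greedy farthest-first traversal: seed with one vector, then repeatedly
    # add the vector maximising its distance to the closest selected vector
    vectors = np.asarray(vectors, dtype=float)
    selected = [start]
    min_dist = np.linalg.norm(vectors - vectors[start], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))  # farthest from all selected so far
        selected.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```

Since each weight vector is a similarity vector over the compared fields (title, artist, track01, ...), spreading the sample out this way tends to cover both clear matches and clear non-matches before the oracle is consulted.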

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 915 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 161 matches and 754 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (161, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (754, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 161 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 161 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 47 matches and 8 non-matches
    Purity of oracle classification:  0.855
    Entropy of oracle classification: 0.598
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
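
Putting the pieces together, the control flow visible in this log is a budgeted, queue-driven loop: pop a cluster, have the oracle classify a sample, grow the match / non-match training sets, and re-queue the remainder if it is still impure. A heavily simplified, self-contained sketch (the real program also splits the remainder with a classifier and supports other queue orderings and sampling methods):

```python
import random

def recursive_select(labels, budget, sample_size, min_purity=0.95, seed=42):
    # labels: true match status (1/0) per weight vector; a perfect-accuracy
    # oracle simply reveals them. Returns the collected training sets and
    # the number of manual classifications performed.
    rng = random.Random(seed)
    queue = [list(range(len(labels)))]
    train_matches, train_non_matches = [], []
    classified = 0
    while queue and classified < budget:
        cluster = queue.pop(rng.randrange(len(queue)))  # queue ordering: random
        sample, rest = cluster[:sample_size], cluster[sample_size:]
        for i in sample:  # oracle classification of the sampled vectors
            (train_matches if labels[i] else train_non_matches).append(i)
        classified += len(sample)
        if rest:
            m = sum(labels[i] for i in rest)
            purity = max(m, len(rest) - m) / len(rest)
            if purity < min_purity:  # not pure enough: split further
                queue.append(rest)
    return train_matches, train_non_matches, classified
```

The loop terminates either when the queue of impure clusters is empty or, as in each run shown here, when the manual classification budget is exhausted.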

39.0
Analysing the file: diverg(10)680_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 680), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)680_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 644
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 644 weight vectors
  Containing 215 true matches and 429 true non-matches
    (33.39% true matches)
  Identified 592 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   556  (93.92%)
          2 :    33  (5.57%)
          3 :     2  (0.34%)
         16 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 592 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 408

Removed 1 non-pure weight vector

Final number of weight vectors to use: 643
  Number of unique weight vectors: 592

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (592, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 592 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 592 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 28 matches and 54 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 510 weight vectors
  Based on 28 matches and 54 non-matches
  Classified 146 matches and 364 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6585365853658537, 0.9262122127346665, 0.34146341463414637)
    (364, 0.6585365853658537, 0.9262122127346665, 0.34146341463414637)

Current size of match and non-match training data sets: 28 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 146 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 51 matches and 3 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.310
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(10)558_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.980198
recall                 0.331104
f-measure                 0.495
da                          101
dm                            0
ndm                           0
tp                           99
fp                            2
tn                  4.76529e+07
fn                          200
Name: (10, 1 - acm diverg, 558), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)558_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 585
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 585 weight vectors
  Containing 155 true matches and 430 true non-matches
    (26.50% true matches)
  Identified 551 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   521  (94.56%)
          2 :    27  (4.90%)
          3 :     2  (0.36%)
          4 :     1  (0.18%)

Identified 0 non-pure unique weight vectors (from 551 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 141
     0.000 : 410

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 585
  Number of unique weight vectors: 551

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (551, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 551 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 551 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 32 matches and 50 non-matches
    Purity of oracle classification:  0.610
    Entropy of oracle classification: 0.965
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

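The purity and entropy figures in these oracle reports follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch reproducing the values above (32 matches and 50 non-matches give purity 0.610 and entropy 0.965):

```python
import math

def purity(num_matches, num_non_matches):
    """Majority-class fraction of the classified weight vectors."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    """Binary Shannon entropy of the match proportion: 0.0 for a pure
    cluster, 1.0 for an even 50/50 split."""
    p = num_matches / (num_matches + num_non_matches)
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(round(purity(32, 50), 3))   # 0.61
print(round(entropy(32, 50), 3))  # 0.965
```

The same two functions also reproduce the per-cluster statistics printed in the queue listings (e.g. purity 0.6097… and entropy 0.9649… for the Loop 2 clusters).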
Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 469 weight vectors
  Based on 32 matches and 50 non-matches
  Classified 130 matches and 339 non-matches

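The SVM step trains on the oracle-classified vectors and splits the remaining cluster members into predicted matches and non-matches. As a dependency-free illustration of that split, here is a nearest-centroid classifier — a deliberate stand-in, not an actual SVM, and the function names are illustrative rather than from the original program:

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def split_cluster(train_matches, train_non_matches, rest):
    """Assign each remaining vector to whichever training-class centroid
    is closer (nearest-centroid stand-in for the SVM split above)."""
    cm, cn = centroid(train_matches), centroid(train_non_matches)
    matches, non_matches = [], []
    for v in rest:
        (matches if math.dist(v, cm) <= math.dist(v, cn)
         else non_matches).append(v)
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows from 1 to 2 in the next loop.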
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)
    (339, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)

Current size of match and non-match training data sets: 32 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.96
- Size 130 weight vectors
- Estimated match proportion 0.390

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 130 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.947, 1.000, 0.292, 0.178, 0.227, 0.122, 0.154] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

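The farthest-first selections above follow the classic greedy traversal: repeatedly add the vector whose minimum distance to the already selected ones is largest. A minimal sketch under assumptions — the original program's seed choice and distance metric are not shown in this log, so Euclidean distance and seeding from the first vector are used here:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly pick the remaining vector whose minimum Euclidean distance
    to the selected set is largest, until k vectors are chosen."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```

This spreads the sample across the cluster's extremes, which is why the selected vectors above mix clear matches and clear non-matches rather than clustering around one region.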
Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 42 matches and 12 non-matches
    Purity of oracle classification:  0.778
    Entropy of oracle classification: 0.764
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  12
    Number of false non-matches: 0

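The oracle step can be simulated by returning each sampled vector's true match status correctly with probability equal to the configured accuracy, and flipped otherwise. A sketch (the seed is an illustrative choice; at the 100.00% accuracy used in this run, every label comes back correct, matching the zero false-match and false-non-match counts above):

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Simulated imperfect oracle: each true match status (a bool) is
    returned correctly with probability `accuracy`, flipped otherwise."""
    rng = rng or random.Random(42)  # fixed seed, purely for reproducibility
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

With accuracy 1.0 the call is a no-op on the labels; lowering it (the `sample_error`/`oracle_acc` command-line parameters) injects labelling noise into the training data.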
Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

101.0
Analysing file: diverg(20)610_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 610), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)610_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1068
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1068 weight vectors
  Containing 226 true matches and 842 true non-matches
    (21.16% true matches)
  Identified 1011 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   974  (96.34%)
          2 :    34  (3.36%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1011 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 821

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1067
  Number of unique weight vectors: 1011

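The uniqueness and pureness analysis above groups identical weight vectors and measures, per unique vector, the fraction of its occurrences that are true matches; unique vectors with mixed labels are the non-pure ones whose minority-class copies get removed. A sketch of that grouping (function and variable names are illustrative, not from the original program):

```python
from collections import Counter, defaultdict

def analyse_vectors(vectors, labels):
    """Group identical weight vectors; return the occurrence-frequency
    distribution and the pureness (match fraction) of each unique vector."""
    groups = defaultdict(list)
    for vec, is_match in zip(vectors, labels):
        groups[tuple(vec)].append(is_match)
    # Occurrence count -> number of unique vectors occurring that often
    freq = Counter(len(labs) for labs in groups.values())
    # Unique vector -> fraction of its occurrences that are true matches
    pureness = {vec: sum(labs) / len(labs) for vec, labs in groups.items()}
    return freq, pureness
```

A pureness of 1.000 or 0.000 marks a pure unique vector; anything in between (like the single 0.950 entry above) is non-pure, and its minority-class occurrences are dropped before training selection begins.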
Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1011, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1011 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1011 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 924 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 151 matches and 773 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (773, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 151 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00 accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)785_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 785), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)785_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 528
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 528 weight vectors
  Containing 208 true matches and 320 true non-matches
    (39.39% true matches)
  Identified 499 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   482  (96.59%)
          2 :    14  (2.81%)
          3 :     2  (0.40%)
         12 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 499 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 317

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 527
  Number of unique weight vectors: 499

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (499, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 499 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 499 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 34 matches and 46 non-matches
    Purity of oracle classification:  0.575
    Entropy of oracle classification: 0.984
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  46
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 419 weight vectors
  Based on 34 matches and 46 non-matches
  Classified 143 matches and 276 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.575, 0.9837082626231857, 0.425)
    (276, 0.575, 0.9837082626231857, 0.425)

Current size of match and non-match training data sets: 34 / 46

Selected cluster (queue ordering: random) with:
- Purity 0.57 and entropy 0.98
- Size 143 weight vectors
- Estimated match proportion 0.425

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 143 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 52 matches and 5 non-matches
    Purity of oracle classification:  0.912
    Entropy of oracle classification: 0.429
    Number of true matches:      52
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(20)797_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 797), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)797_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
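
The occurrence distribution above (how many distinct weight vectors appear once, twice, twenty times, ...) is a count of counts. A minimal sketch, assuming each weight vector is hashable as a tuple of similarity values:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that occur exactly that often."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    # occurrence count -> number of distinct vectors with that count
    return Counter(vec_counts.values())

vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.9)]
print(occurrence_distribution(vecs))  # Counter({1: 2, 2: 1})
```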

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
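
A non-pure unique weight vector is one that was generated by both matching and non-matching record pairs; the step above drops its minority-class copies (here, the one non-match copy of a vector with pureness 0.950). A minimal sketch of that clean-up; the function name and the tie-handling rule are mine, not the program's:

```python
from collections import Counter, defaultdict

def remove_minority_class(vec_label_pairs):
    """For every distinct weight vector carrying both match and
    non-match labels, drop the minority-class copies (an exact tie
    drops both sides here; the original tie rule is unknown)."""
    counts = defaultdict(Counter)
    for vec, is_match in vec_label_pairs:
        counts[tuple(vec)][is_match] += 1
    kept = []
    for vec, is_match in vec_label_pairs:
        c = counts[tuple(vec)]
        if len(c) == 1 or c[is_match] > c[not is_match]:
            kept.append((vec, is_match))
    return kept

# 19 match copies and 1 non-match copy of the same vector (pureness 0.95)
pairs = [((1.0, 0.9), True)] * 19 + [((1.0, 0.9), False), ((0.1, 0.2), False)]
print(len(remove_minority_class(pairs)))  # 20
```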

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88
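
One formula that reproduces every per-cluster sample size in this log (88 of 1044, 87 of 923, 71 of 273, 64 of 262, 38 of 63) is Cochran's sample-size formula with finite-population correction, using a 95% confidence level (z = 1.96), a 0.10 margin of error, and the cluster's estimated match proportion as p. Whether the program actually uses this formula is an assumption; it is simply consistent with all the sizes shown:

```python
def cluster_sample_size(cluster_size, est_match_prop, z=1.96, margin=0.10):
    """Cochran's sample size n0 = z^2 p (1-p) / e^2, corrected for a
    finite population of cluster_size weight vectors."""
    p = est_match_prop
    n0 = z * z * p * (1.0 - p) / (margin * margin)
    n = n0 / (1.0 + n0 / cluster_size)
    return int(round(n))

print(cluster_sample_size(1044, 0.5))  # 88
```

With the estimated match proportions the log reports later (1/3 for the 262-vector cluster, 0.493 for the 63-vector cluster), the same call gives 64 and 38.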

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
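
The "far" initial selection above is a farthest-first traversal: starting from a seed vector, repeatedly add the vector whose distance to its nearest already-selected vector is largest, so the sample spreads across the cluster. A minimal numpy sketch; the Euclidean metric and the seed choice are assumptions, since the program's distance function is not shown in the log:

```python
import numpy as np

def farthest_first(vectors, k, seed_idx=0):
    """Greedy farthest-first selection of k row vectors: each step adds
    the vector maximizing the distance to its closest selected vector."""
    X = np.asarray(vectors, dtype=float)
    selected = [seed_idx]
    # distance of every vector to its nearest selected vector so far
    min_dist = np.linalg.norm(X - X[seed_idx], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected

pts = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [0.0, 1.0]]
print(farthest_first(pts, 3))  # [0, 1, 3]
```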

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches
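
After the oracle labels the sample, the remaining vectors in the cluster are split according to the predictions of an SVM trained on those labels. A minimal scikit-learn sketch; the kernel and parameters of the original classifier are not shown in the log, so a default `SVC` is an assumption:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on the oracle-classified weight vectors (label 1 =
    match, 0 = non-match) and split the remaining cluster into
    predicted matches and non-matches."""
    clf = SVC()  # default RBF kernel; the original settings are unknown
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    preds = clf.predict(np.asarray(remaining_vecs))
    matches = [v for v, p in zip(remaining_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if p == 0]
    return matches, non_matches

# Toy usage with a clearly separable training sample (illustrative only)
m, nm = svm_split([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]],
                  [0, 0, 1, 1],
                  [[0.05, 0.0], [0.95, 1.0]])
```

Note that a heavily imbalanced training sample can push all predictions to one side, as in the 0-match / 956-non-match split above, in which case only one child cluster remains.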

39.0
Analyzing file: diverg(10)589_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (10, 1 - acm diverg, 589), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)589_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 960
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 960 weight vectors
  Containing 165 true matches and 795 true non-matches
    (17.19% true matches)
  Identified 923 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   892  (96.64%)
          2 :    28  (3.03%)
          3 :     2  (0.22%)
          6 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 923 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 148
     0.000 : 775

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 960
  Number of unique weight vectors: 923

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (923, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 923 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 923 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 29 matches and 58 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 836 weight vectors
  Based on 29 matches and 58 non-matches
  Classified 262 matches and 574 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (262, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (574, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 29 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 262 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 262 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 41 matches and 23 non-matches
    Purity of oracle classification:  0.641
    Entropy of oracle classification: 0.942
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  23
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analyzing file: diverg(10)842_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985915
recall                 0.234114
f-measure              0.378378
da                           71
dm                            0
ndm                           0
tp                           70
fp                            1
tn                  4.76529e+07
fn                          229
Name: (10, 1 - acm diverg, 842), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)842_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 294
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 294 weight vectors
  Containing 180 true matches and 114 true non-matches
    (61.22% true matches)
  Identified 273 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   258  (94.51%)
          2 :    12  (4.40%)
          3 :     2  (0.73%)
          6 :     1  (0.37%)

Identified 0 non-pure unique weight vectors (from 273 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 161
     0.000 : 112

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 294
  Number of unique weight vectors: 273

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (273, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 273 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 71

Perform initial selection using "far" method

Farthest first selection of 71 weight vectors from 273 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 35 matches and 36 non-matches
    Purity of oracle classification:  0.507
    Entropy of oracle classification: 1.000
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  36
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 202 weight vectors
  Based on 35 matches and 36 non-matches
  Classified 139 matches and 63 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 71
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (139, 0.5070422535211268, 0.9998568991526107, 0.49295774647887325)
    (63, 0.5070422535211268, 0.9998568991526107, 0.49295774647887325)

Current size of match and non-match training data sets: 35 / 36

Selected cluster (queue ordering: random) with:
- Purity 0.51 and entropy 1.00
- Size 63 weight vectors
- Estimated match proportion 0.493

Sample size for this cluster: 38

Farthest first selection of 38 weight vectors from 63 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)

Perform oracle with 100.00% accuracy on 38 weight vectors
  The oracle will correctly classify 38 weight vectors and wrongly classify 0
  Classified 0 matches and 38 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0
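
The purity and entropy values reported by the oracle steps can be reproduced with a short sketch. This is a minimal reconstruction, assuming purity is the majority-class fraction and entropy is the Shannon entropy (in bits) of the match/non-match split; the original implementation may differ in detail.

```python
import math

def cluster_stats(num_matches, num_non_matches):
    # Purity: fraction of the majority class in the sample.
    # Entropy: Shannon entropy (bits) of the match / non-match split.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the 28-match / 52-non-match sample later in this log, this gives purity 0.650 and entropy 0.934, matching the reported values.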

*** Warning: Oracle returns an empty match dictionary ***
Deleted 38 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

71.0
Analysing the file: diverg(10)379_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 379), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)379_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 509
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 509 weight vectors
  Containing 189 true matches and 320 true non-matches
    (37.13% true matches)
  Identified 481 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   468  (97.30%)
          2 :    10  (2.08%)
          3 :     2  (0.42%)
         15 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 481 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 161
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 319

Removed 1 non-pure weight vector

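The pureness filtering above (dropping minority-class copies of a weight vector that occurs with both match statuses) can be sketched as follows. This is a hypothetical helper, not the original code; the grouping key and majority rule are assumptions consistent with the log.

```python
from collections import Counter

def remove_minority_copies(weight_vectors, labels):
    """Group identical weight vectors; in any group that mixes match and
    non-match labels, keep only the majority-class copies."""
    groups = {}
    for vec, lab in zip(weight_vectors, labels):
        groups.setdefault(tuple(vec), []).append(lab)
    kept = []
    for vec, labs in groups.items():
        majority = Counter(labs).most_common(1)[0][0]
        kept.extend((vec, lab) for lab in labs if lab == majority)
    return kept
```

This mirrors the 0.933-pureness group above: a vector occurring 15 times as 14 matches and 1 non-match has its single minority copy removed.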
Final number of weight vectors to use: 508
  Number of unique weight vectors: 481

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (481, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 481 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 481 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

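The farthest-first selection used above can be sketched as a greedy traversal: repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest. Seeding from the first vector is an assumption; the original may seed differently (e.g. from a random vector or a corner).

```python
import math

def farthest_first(vectors, k):
    # Greedily pick the vector whose minimum distance to the
    # already-selected set is largest, until k vectors are chosen.
    selected = [vectors[0]]  # seed choice is an assumption
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```
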
Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 28 matches and 52 non-matches
    Purity of oracle classification:  0.650
    Entropy of oracle classification: 0.934
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 401 weight vectors
  Based on 28 matches and 52 non-matches
  Classified 137 matches and 264 non-matches

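The SVM split above (train on the oracle-labelled sample, then partition the rest of the cluster by predicted class) can be sketched with scikit-learn. Using `SVC` with a linear kernel is an assumption; the original code may use a different SVM implementation or kernel.

```python
import numpy as np
from sklearn.svm import SVC

def svm_split_cluster(match_vecs, nonmatch_vecs, unlabelled_vecs):
    """Train an SVM on the labelled weight vectors, then split the
    remaining cluster into predicted matches and non-matches."""
    X = np.vstack([match_vecs, nonmatch_vecs])
    y = np.array([1] * len(match_vecs) + [0] * len(nonmatch_vecs))
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(X, y)
    pred = clf.predict(unlabelled_vecs)
    return unlabelled_vecs[pred == 1], unlabelled_vecs[pred == 0]
```
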
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (137, 0.65, 0.934068055375491, 0.35)
    (264, 0.65, 0.934068055375491, 0.35)

Current size of match and non-match training data sets: 28 / 52

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 137 weight vectors
- Estimated match proportion 0.350

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 137 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 51 matches and 3 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.310
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing the file: diverg(20)384_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 384), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)384_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)190_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 190), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)190_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1034
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1034 weight vectors
  Containing 188 true matches and 846 true non-matches
    (18.18% true matches)
  Identified 992 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.88%)
          2 :    28  (2.82%)
          3 :     2  (0.20%)
         11 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 992 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 166
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 825

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1033
  Number of unique weight vectors: 992

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (992, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 992 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 992 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and misclassify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0
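The purity and entropy figures the log reports for an oracle sample can be reproduced from the match/non-match counts alone. A minimal sketch, assuming purity is the majority-class fraction and entropy is the Shannon entropy (in bits) of the match/non-match split; `purity_entropy` is a hypothetical helper, not a function from the program:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity: fraction of the sample in the majority class.
    # Entropy: binary Shannon entropy of the match proportion, in bits.
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# The Loop 1 oracle sample above: 23 matches, 64 non-matches out of 87
purity, entropy = purity_entropy(23, 64)
```

For the 23/64 split this gives 0.736 and 0.833, matching the values in the log (and the 23/87 = 0.264 estimated match proportion reported for the child clusters).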

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 905 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 77 matches and 828 non-matches
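At this step the program trains an SVM on the oracle-labelled vectors and uses it to split the remaining vectors of the cluster into candidate match and non-match groups. The SVM implementation itself is not shown in the log; as a dependency-free sketch of the same train-then-split flow, here is a perceptron standing in for the SVM (toy data and all names are hypothetical):

```python
def train_perceptron(pos, neg, epochs=100, lr=0.1):
    # Train a linear separator on labelled weight vectors
    # (pos = oracle matches, neg = oracle non-matches).
    dim = len(pos[0])
    w = [0.0] * dim
    b = 0.0
    data = [(v, 1) for v in pos] + [(v, -1) for v in neg]
    for _ in range(epochs):
        for v, y in data:
            score = sum(wi * xi for wi, xi in zip(w, v)) + b
            if y * score <= 0:  # misclassified: nudge the hyperplane
                w = [wi + lr * y * xi for wi, xi in zip(w, v)]
                b += lr * y
    return w, b

def classify(w, b, v):
    # True -> goes to the candidate-match cluster, False -> non-match cluster
    return sum(wi * xi for wi, xi in zip(w, v)) + b > 0

# Toy example: matches have high similarities, non-matches low
matches = [(0.9, 1.0), (1.0, 0.8)]
non_matches = [(0.1, 0.2), (0.0, 0.3)]
w, b = train_perceptron(matches, non_matches)
```

The two clusters produced here (classified matches and classified non-matches) are what gets pushed back onto the queue, each inheriting the purity, entropy, and match-proportion estimates of the oracle sample.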

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (77, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (828, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 77 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 38

Farthest first selection of 38 weight vectors from 77 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
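The "farthest first selection" used to draw these samples can be sketched as a greedy traversal: repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal version, assuming Euclidean distance and seeding from the first vector (the program's actual distance and seeding rule are not shown in the log):

```python
def farthest_first(vectors, k, dist=None):
    # Greedy farthest-first traversal: the next pick is the vector
    # farthest from everything selected so far.
    if dist is None:
        dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    selected = [vectors[0]]
    # min distance from each candidate to the selected set
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):  # refresh min distances
            d = dist(v, vectors[i])
            if d < min_d[j]:
                min_d[j] = d
    return selected

# On a 1-D toy set the picks spread out across the range:
sample = farthest_first([(0.0,), (1.0,), (0.1,), (0.9,), (0.5,)], 3)
# -> [(0.0,), (1.0,), (0.5,)]
```

Already-selected vectors end up with a minimum distance of zero, so they are never picked twice; the effect is a sample that covers the extremes of the cluster, which is why the listings above mix very high and very low similarity vectors.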

Perform oracle with 100.00% accuracy on 38 weight vectors
  The oracle will correctly classify 38 weight vectors and misclassify 0
  Classified 38 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 38 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(20)95_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 95), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)95_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
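The occurrence distribution printed above (how many unique weight vectors appear once, twice, and so on) is a two-level count. A sketch using `collections.Counter`; `occurrence_distribution` is a hypothetical helper name:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First level: how often each unique vector occurs.
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    # Second level: occurrence count -> number of unique vectors
    # that occur that often.
    return Counter(vec_counts.values())

vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
           (0.9, 0.9), (0.2, 0.3), (0.2, 0.3)]
freq = occurrence_distribution(vectors)
# -> one vector occurring once, one twice, one three times
```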

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and misclassify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 817 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 131 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and misclassify 0
  Classified 48 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)232_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 232), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)232_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 337
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 337 weight vectors
  Containing 189 true matches and 148 true non-matches
    (56.08% true matches)
  Identified 316 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   302  (95.57%)
          2 :    11  (3.48%)
          3 :     2  (0.63%)
          7 :     1  (0.32%)

Identified 0 non-pure unique weight vectors (from 316 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 168
     0.000 : 148

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 337
  Number of unique weight vectors: 316

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (316, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 316 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 316 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.833, 1.000, 1.000, 0.935] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and misclassify 0
  Classified 52 matches and 22 non-matches
    Purity of oracle classification:  0.703
    Entropy of oracle classification: 0.878
    Number of true matches:      52
    Number of false matches:     0
    Number of true non-matches:  22
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 242 weight vectors
  Based on 52 matches and 22 non-matches
  Classified 242 matches and 0 non-matches

68.0
Analysing file: diverg(20)387_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 387), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)387_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0
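The purity, entropy, and estimated match proportion reported for a cluster follow directly from its match and non-match counts. A minimal sketch reproducing the figures above (23 matches, 64 non-matches); the function name is illustrative, not from the program:

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity, binary Shannon entropy, and match proportion of a cluster."""
    total = num_match + num_non_match
    p = num_match / total                  # estimated match proportion
    purity = max(p, 1.0 - p)               # fraction in the majority class
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy, p

# Reproduce the oracle figures above: 23 matches, 64 non-matches
purity, entropy, match_prop = cluster_stats(23, 64)
print(round(purity, 3), round(entropy, 3), round(match_prop, 3))  # 0.736 0.833 0.264
```

The same numbers reappear in the Loop 2 queue listing, since both child clusters initially inherit the parent's statistics.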

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches
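The SVM split step trains on the oracle-labelled vectors and partitions the remaining unlabelled vectors by predicted class. A hedged sketch with scikit-learn on stand-in random data (the original program may use a different SVM implementation or kernel):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Stand-in data: 7-dimensional similarity weight vectors in [0, 1]
# (87 oracle-labelled training vectors: 23 matches, 64 non-matches)
train_X = rng.random((87, 7))
train_y = np.array([1] * 23 + [0] * 64)
cluster = rng.random((932, 7))            # unlabelled vectors left in the cluster

svm = SVC(kernel="linear")                # kernel choice is an assumption
svm.fit(train_X, train_y)

pred = svm.predict(cluster)
match_cluster = cluster[pred == 1]        # one of the two new clusters in the queue
non_match_cluster = cluster[pred == 0]    # the other
```

The two predicted subsets become the two clusters pushed onto the queue for the next loop.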

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
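Farthest-first selection, as used above, greedily adds the vector whose distance to the nearest already-selected vector is largest, spreading the sample across the cluster. A sketch under common assumptions (Euclidean distance, arbitrary starting seed; the program's exact seeding may differ):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedily pick k vector indices; each new pick maximises the
    distance to its nearest already-selected vector (Euclidean)."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [seed]                      # start from an arbitrary vector
    min_dist = np.linalg.norm(vectors - vectors[seed], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))     # farthest from the current sample
        selected.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected

rng = np.random.default_rng(1)
vecs = rng.random((820, 7))                # stand-in for the 820-vector cluster
idx = farthest_first(vecs, 68)             # mirrors the 68-vector sample above
```

Keeping a running minimum distance per vector makes each pick O(n) rather than recomputing all pairwise distances.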

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
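An oracle with a given accuracy can be simulated by flipping each true label with probability 1 - accuracy; at 100.00% accuracy nothing is flipped, which matches the zero false counts above. A sketch (names are illustrative, not from the program):

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    """Keep each true label with probability `accuracy`, otherwise flip it."""
    rng = rng or random.Random(7)
    return [lbl if rng.random() < accuracy else not lbl for lbl in true_labels]

# 14 true matches and 54 true non-matches, as in the oracle call above
truth = [True] * 14 + [False] * 54
labelled = noisy_oracle(truth, 1.0)        # random() < 1.0 always: no flips
```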

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)863_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 863), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)863_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
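The occurrence distribution above can be derived by hashing each weight vector and counting duplicates. A sketch with `collections.Counter` on hypothetical data:

```python
from collections import Counter

# Hypothetical weight vectors, stored as tuples so they are hashable
vectors = [(1.0, 0.5), (1.0, 0.5),
           (0.2, 0.9), (0.2, 0.9), (0.2, 0.9),
           (0.0, 0.1)]

vec_counts = Counter(vectors)              # unique vector -> how often it occurs
freq_dist = Counter(vec_counts.values())   # occurrence -> number of unique vectors

for occ, num in sorted(freq_dist.items()):
    print(f"{occ:2d} : {num}  ({100.0 * num / len(vec_counts):.2f}%)")
```

The percentages here are taken over unique vectors, matching the table format above.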

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
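A unique weight vector is non-pure when its duplicate copies carry conflicting true-match labels; the filter keeps the majority class and removes the minority copies (above, 1 copy at pureness 0.950 is removed). A sketch under that interpretation, on hypothetical data:

```python
from collections import defaultdict

# Hypothetical (weight_vector, true_match) pairs: one vector occurs 20 times
# with conflicting labels (19 matches, 1 non-match -> pureness 0.950)
data = ([((1.0, 0.9), True)] * 19 + [((1.0, 0.9), False)]
        + [((0.1, 0.2), False)] * 5)

labels = defaultdict(list)
for vec, is_match in data:
    labels[vec].append(is_match)

kept = []
for vec, flags in labels.items():
    pureness = sum(flags) / len(flags)     # fraction of copies labelled match
    majority = pureness >= 0.5             # tie-break towards match is an assumption
    kept.extend((vec, f) for f in flags if f == majority)

print(len(data) - len(kept))               # 1 minority-class copy removed
```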

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)427_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (15, 1 - acm diverg, 427), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)427_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 945
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 945 weight vectors
  Containing 211 true matches and 734 true non-matches
    (22.33% true matches)
  Identified 892 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   857  (96.08%)
          2 :    32  (3.59%)
          3 :     2  (0.22%)
         18 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 892 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 713

Removed 1 non-pure weight vector

Final number of weight vectors to use: 944
  Number of unique weight vectors: 892

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (892, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 892 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 892 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 806 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 151 matches and 655 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (655, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 655 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 655 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
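Farthest-first selection as used above can be sketched as a greedy max-min traversal. This is a hypothetical reconstruction assuming Euclidean distance and the first vector as seed; the script's seed choice and tie-breaking may differ:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    minimum Euclidean distance to the already-selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                       # assumed seed choice
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        min_dist = [min(d, dist(v, vectors[i]))
                    for d, v in zip(min_dist, vectors)]
    return selected
```

Because each pick maximises the minimum distance to everything chosen so far, the sample spreads out over the cluster rather than concentrating in one region, which is why the selected vectors above mix matches and non-matches.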

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 3 matches and 72 non-matches
    Purity of oracle classification:  0.960
    Entropy of oracle classification: 0.242
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
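Each run above follows the same budgeted loop: pop a cluster from the queue, sample it, have the oracle label the sample, and split and re-queue what remains until the manual classification budget is exhausted. A skeleton sketch, where `sample`, `oracle`, and `split` are hypothetical stand-ins for the sampling, manual labelling, and SVM-split steps, and vectors are assumed hashable (e.g. tuples):

```python
import random

def recursive_selection(clusters, budget, sample, oracle, split):
    """Skeleton of the budgeted select-sample-label-split loop."""
    matches, non_matches = [], []
    num_labelled = 0
    while clusters and num_labelled < budget:
        # queue ordering: random (as in the run above)
        cluster = clusters.pop(random.randrange(len(clusters)))
        labelled = oracle(sample(cluster))        # [(vector, is_match), ...]
        num_labelled += len(labelled)
        matches += [v for v, is_match in labelled if is_match]
        non_matches += [v for v, is_match in labelled if not is_match]
        seen = {v for v, _ in labelled}
        remaining = [v for v in cluster if v not in seen]
        # a cluster that is pure enough and small enough would be kept
        # whole; otherwise split() returns sub-clusters to re-queue
        clusters += split(remaining, matches, non_matches)
    return matches, non_matches
```

The oracle-labelled vectors accumulate as the match/non-match training data sets reported at the start of each loop iteration.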

48.0
Analysing file: diverg(15)763_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 763), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)763_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1008
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1008 weight vectors
  Containing 223 true matches and 785 true non-matches
    (22.12% true matches)
  Identified 954 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   917  (96.12%)
          2 :    34  (3.56%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 954 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 764
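The uniqueness and pureness analysis above amounts to counting how often each distinct weight vector occurs and what fraction of those occurrences are true matches. A sketch (the function name is ours):

```python
from collections import Counter

def analyse_weight_vectors(weight_vectors, match_labels):
    """Occurrence count and pureness (fraction of occurrences that are
    true matches) for each distinct weight vector."""
    occurrences = Counter(map(tuple, weight_vectors))
    match_counts = Counter(
        tuple(v) for v, is_match in zip(weight_vectors, match_labels) if is_match)
    pureness = {v: match_counts[v] / n for v, n in occurrences.items()}
    return occurrences, pureness

vecs = [[1.0, 0.9], [1.0, 0.9], [0.1, 0.2]]
labels = [True, False, False]
occ, pure = analyse_weight_vectors(vecs, labels)
```

Vectors whose pureness is strictly between 0 and 1 (like the single 0.941 entry above) are non-pure; the script removes their minority-class copies before selection starts.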

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1007
  Number of unique weight vectors: 954

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (954, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 954 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 954 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 867 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 164 matches and 703 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (164, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (703, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 703 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 703 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 2 matches and 75 non-matches
    Purity of oracle classification:  0.974
    Entropy of oracle classification: 0.174
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)679_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 679), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)679_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)346_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 346), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)346_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
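The distribution above is a count-of-counts over the weight vectors. A sketch of how such a table can be built with `collections.Counter` (the vectors here are hypothetical; the program derives them from the similarity columns of the CSV file):

```python
from collections import Counter

# Hypothetical weight vectors, as tuples so they are hashable
vectors = [(1.0, 0.9), (1.0, 0.9), (0.5, 0.2),
           (0.3, 0.3), (0.3, 0.3), (0.3, 0.3)]

vec_freq = Counter(vectors)                   # how often each unique vector occurs
count_of_counts = Counter(vec_freq.values())  # occurrence : number of vectors

for occ, num in sorted(count_of_counts.items()):
    print('%6d : %5d  (%.2f%%)' % (occ, num, 100.0 * num / len(vec_freq)))
```

With these toy vectors, one unique vector occurs once, one twice, and one three times, so each occurrence count covers a third of the unique vectors.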

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
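Farthest-first selection, as used above, greedily picks each next vector as the one whose minimum distance to the already selected set is largest. A minimal sketch under Euclidean distance (the distance metric actually used by the program is not shown in this log):

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    # Seed with the first vector, then repeatedly add the vector whose
    # minimum distance to the current selection is largest
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0)], 2)
# (1.0, 1.0) is picked second, being farthest from the seed (0.0, 0.0)
```

This greedy traversal tends to spread the sampled vectors over the whole cluster, which is why the selected list above mixes clear matches, clear non-matches, and borderline cases.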

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches
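The split step trains a classifier on the oracle-labelled vectors and partitions the remaining cluster into a predicted-match and a predicted-non-match cluster, which are then both pushed onto the queue. As a rough illustration only, here is a nearest-centroid stand-in for the SVM (the program's actual SVM implementation and parameters are not visible in this log):

```python
def centroid(vectors):
    # Component-wise mean of a list of equal-length vectors
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def split_cluster(cluster, match_train, nonmatch_train):
    # Assign each unlabelled vector to the closer of the two
    # training-set centroids (squared Euclidean distance)
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    cm, cn = centroid(match_train), centroid(nonmatch_train)
    matches = [v for v in cluster if sq_dist(v, cm) <= sq_dist(v, cn)]
    non_matches = [v for v in cluster if sq_dist(v, cm) > sq_dist(v, cn)]
    return matches, non_matches

m, n = split_cluster([(0.9, 0.9), (0.1, 0.2)], [(1.0, 1.0)], [(0.0, 0.0)])
# (0.9, 0.9) falls into the match cluster, (0.1, 0.2) into the non-match one
```

An actual SVM draws a maximum-margin boundary rather than a midpoint between centroids, but the role in the recursion is the same: one cluster per predicted class.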

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 103 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 103 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 43 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(20)27_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 27), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)27_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)46_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 46), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)46_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 733
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 733 weight vectors
  Containing 221 true matches and 512 true non-matches
    (30.15% true matches)
  Identified 697 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   681  (97.70%)
          2 :    13  (1.87%)
          3 :     2  (0.29%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 697 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 511

Removed 1 non-pure weight vector

Final number of weight vectors to use: 732
  Number of unique weight vectors: 697

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (697, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 697 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 697 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
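The purity and entropy figures the oracle reports follow directly from the 28/56 match/non-match split; a minimal sketch of the two measures (the helper name is illustrative, not the script's actual function):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity: fraction of the majority class in the classified sample.
    # Entropy: binary Shannon entropy of the match/non-match split.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# The 28 matches / 56 non-matches above give purity ~0.667, entropy ~0.918
purity, entropy = purity_entropy(28, 56)
```

A perfectly pure cluster (all matches or all non-matches) has purity 1 and entropy 0, which is why purity is checked against the min_purity parameter before deciding whether a cluster must be split further.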

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 613 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 142 matches and 471 non-matches
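The SVM step above trains on the 28 + 56 oracle-labelled vectors and splits the remaining unlabelled vectors of the cluster into a predicted-match and a predicted-non-match child, which are then pushed onto the queue. A sketch of that split, assuming a scikit-learn linear SVM (the original script's SVM binding is not shown in this log):

```python
# Cluster-splitting sketch: fit an SVM on the oracle-labelled sample,
# then partition the unlabelled remainder by its predictions.
# scikit-learn is an assumption; the script may use another SVM library.
from sklearn import svm

def split_cluster(train_vectors, train_labels, cluster_vectors):
    clf = svm.SVC(kernel='linear')
    clf.fit(train_vectors, train_labels)   # e.g. 28 matches + 56 non-matches
    preds = clf.predict(cluster_vectors)   # classify the remaining vectors
    match_child = [v for v, p in zip(cluster_vectors, preds) if p == 1]
    non_match_child = [v for v, p in zip(cluster_vectors, preds) if p == 0]
    return match_child, non_match_child    # the two clusters queued next
```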

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (471, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.92
- Size 142 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 53
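The sample sizes in this log (85 of 769, 53 of 142, 74 of 538) are consistent with Cochran's sample-size formula with a finite-population correction, using the cluster's estimated match proportion as p. Note that z = 1.96 (95% confidence) and error = 0.1 are assumptions here, since the value of the script's sample_error parameter is not shown:

```python
def sample_size(cluster_size, est_match_prop, z=1.96, error=0.1):
    # Cochran's formula with finite-population correction (a sketch;
    # z and error values are assumed, and the script's exact rounding
    # may differ for some clusters).
    p = est_match_prop
    n0 = z * z * p * (1.0 - p) / (error * error)
    return int(n0 / (1.0 + (n0 - 1.0) / cluster_size))

sample_size(769, 0.5)      # 85, as in Loop 1
sample_size(142, 1 / 3)    # 53, as for this cluster
```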

Farthest first selection of 53 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
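Farthest-first selection greedily adds the weight vector whose minimum Euclidean distance to the vectors already selected is largest, spreading the sample across the cluster; a minimal sketch (the original may seed the traversal differently):

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: start from an arbitrary vector,
    # then repeatedly take the vector farthest from the current selection.
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```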

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 51 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.232
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)277_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 277), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)277_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
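The occurrence distribution above counts how many weight vectors are exact duplicates of each other; a sketch with collections.Counter (the example vectors are hypothetical):

```python
from collections import Counter

# Two nested counts: how often each weight vector occurs, then how many
# unique vectors share each occurrence count (the distribution above).
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.9), (0.2, 0.9), (0.2, 0.9), (0.7, 0.1)]
occ = Counter(vectors)         # vector -> number of occurrences
dist = Counter(occ.values())   # occurrence count -> number of unique vectors
for count in sorted(dist):
    pct = 100.0 * dist[count] / len(occ)
    print('%10d : %5d  (%.2f%%)' % (count, dist[count], pct))
```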

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 146 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (538, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 538 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 538 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 9 matches and 65 non-matches
    Purity of oracle classification:  0.878
    Entropy of oracle classification: 0.534
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)109_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 109), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)109_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 794
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 794 weight vectors
  Containing 209 true matches and 585 true non-matches
    (26.32% true matches)
  Identified 747 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   712  (95.31%)
          2 :    32  (4.28%)
          3 :     2  (0.27%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 747 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 564

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 793
  Number of unique weight vectors: 747

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (747, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 747 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 747 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 662 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 155 matches and 507 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (507, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 155 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 155 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 47 matches and 9 non-matches
    Purity of oracle classification:  0.839
    Entropy of oracle classification: 0.636
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

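The purity and entropy figures reported by the oracle step follow the standard two-class definitions: purity is the majority-class fraction of the sample, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch of that calculation (function name is illustrative, not taken from the original script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Binary purity (majority-class fraction) and Shannon entropy."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)
    if p in (0.0, 1.0):              # a pure sample has zero entropy
        return purity, 0.0
    entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return purity, entropy
```

For the 47 matches / 9 non-matches above this gives purity 47/56 = 0.839 and entropy 0.636, matching the logged values.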
Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(15)566_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 566), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)566_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 645
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 645 weight vectors
  Containing 211 true matches and 434 true non-matches
    (32.71% true matches)
  Identified 593 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   557  (93.93%)
          2 :    33  (5.56%)
          3 :     2  (0.34%)
         16 :     1  (0.17%)

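The frequency distribution above tallies how often each distinct weight vector occurs (here 557 + 2·33 + 3·2 + 16·1 = 645 vectors, 593 of them unique). A minimal way to reproduce such a tally, assuming vectors can be hashed as tuples (an assumption about the script's internals):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map occurrence count -> number of distinct vectors with that count."""
    per_vector = Counter(tuple(v) for v in vectors)  # occurrences of each vector
    return Counter(per_vector.values())              # distribution of those counts
```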
Identified 1 non-pure unique weight vector (from 593 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 413

Removed 1 non-pure weight vector

Final number of weight vectors to use: 644
  Number of unique weight vectors: 593

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (593, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 593 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 593 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

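The "far" selection listed above is a farthest-first traversal: starting from a seed vector, each step adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the weight-vector space. A sketch under assumed details (Euclidean metric, first vector as seed — the original script's seeding and metric may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors (Euclidean metric)."""
    selected = [vectors[0]]  # seed with the first vector (an assumption)
    # Minimum distance from each candidate to the selected set so far
    min_d = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        # Pick the candidate farthest from everything selected so far
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], math.dist(v, vectors[i]))
    return selected
```

Each added vector tightens the `min_d` bounds, so the loop costs O(k·n) distance computations for n candidates.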
Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 32 matches and 50 non-matches
    Purity of oracle classification:  0.610
    Entropy of oracle classification: 0.965
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 511 weight vectors
  Based on 32 matches and 50 non-matches
  Classified 171 matches and 340 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (171, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)
    (340, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)

Current size of match and non-match training data sets: 32 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.96
- Size 340 weight vectors
- Estimated match proportion 0.390

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 340 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.367, 0.667, 0.583, 0.625, 0.316] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.438, 0.500, 0.467, 0.529, 0.611] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.818, 0.727, 0.438, 0.375, 0.400] (False)
    [0.857, 0.000, 0.500, 0.389, 0.235, 0.045, 0.526] (False)
    [1.000, 0.000, 0.476, 0.179, 0.500, 0.412, 0.357] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.833, 0.571, 0.727, 0.647, 0.857] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.583, 0.875, 0.727, 0.833, 0.643] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 0 matches and 72 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analyzing file: diverg(15)19_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 19), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)19_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1064
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1064 weight vectors
  Containing 219 true matches and 845 true non-matches
    (20.58% true matches)
  Identified 1008 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   972  (96.43%)
          2 :    33  (3.27%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1008 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 824

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1063
  Number of unique weight vectors: 1008

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1008, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1008 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1008 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 921 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 326 matches and 595 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (326, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (595, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 595 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 595 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 0 matches and 76 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing file: diverg(10)689_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                 0.976
recall                 0.408027
f-measure              0.575472
da                          125
dm                            0
ndm                           0
tp                          122
fp                            3
tn                  4.76529e+07
fn                          177
Name: (10, 1 - acm diverg, 689), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)689_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 716
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 716 weight vectors
  Containing 142 true matches and 574 true non-matches
    (19.83% true matches)
  Identified 682 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   653  (95.75%)
          2 :    26  (3.81%)
          3 :     2  (0.29%)
          5 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 682 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 128
     0.000 : 554

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 716
  Number of unique weight vectors: 682

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (682, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 682 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 682 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
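
A minimal sketch of the farthest-first ("far") selection shown above: start from one vector, then greedily add the vector whose minimum Euclidean distance to the already-selected set is largest. The deterministic start index and the names are assumptions for illustration; the actual script may choose the seed vector differently:

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly select the vector
    whose minimum Euclidean distance to the already-selected set is
    largest (ties broken by list order)."""
    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        nxt = max(remaining,
                  key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected

# The two most spread-out points of a 1-D toy set:
picked = farthest_first([(0.0,), (1.0,), (10.0,)], 2)
# picked -> [(0.0,), (10.0,)]
```

This greedy traversal tends to pick vectors near the corners of the weight-vector space, which is why both clearly matching and clearly non-matching vectors appear in the sample.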

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 26 matches and 58 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.893
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
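
The purity and entropy figures reported for an oracle-classified sample appear to be the majority-class fraction and the binary (base-2) entropy of the match proportion; a sketch under that assumption (the function name is illustrative):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class in the sample.
    Entropy: binary entropy of the match proportion, in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# 26 matches / 58 non-matches, as classified by the oracle above:
purity, entropy = purity_and_entropy(26, 58)
# purity ~ 0.690, entropy ~ 0.893
```

A perfectly pure sample has purity 1.0 and entropy 0.0; a 50/50 split has purity 0.5 and entropy 1.0, matching the initial cluster in Loop 1.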

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 598 weight vectors
  Based on 26 matches and 58 non-matches
  Classified 94 matches and 504 non-matches
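
The split step trains a classifier on the oracle-labelled samples and divides the remaining cluster by predicted class. The script uses an SVM; to keep this sketch self-contained, a nearest-centroid rule stands in for the classifier, and all names are illustrative:

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))

def split_cluster(match_samples, non_match_samples, cluster):
    """Assign each remaining vector to the sub-cluster whose
    labelled-sample centroid is nearest (stand-in for the SVM)."""
    m_cent = centroid(match_samples)
    n_cent = centroid(non_match_samples)
    matches, non_matches = [], []
    for v in cluster:
        (matches if math.dist(v, m_cent) <= math.dist(v, n_cent)
         else non_matches).append(v)
    return matches, non_matches

m, n = split_cluster([(1.0, 1.0)], [(0.0, 0.0)],
                     [(0.9, 0.9), (0.1, 0.1)])
# m -> [(0.9, 0.9)], n -> [(0.1, 0.1)]
```

Both resulting sub-clusters are pushed back onto the queue (hence "Queue length: 2" in Loop 2), each inheriting the purity, entropy, and estimated match proportion measured on the oracle sample.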

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)
    (504, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)

Current size of match and non-match training data sets: 26 / 58

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 504 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 504 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 3 matches and 67 non-matches
    Purity of oracle classification:  0.957
    Entropy of oracle classification: 0.255
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

125.0
Analyzing file: diverg(20)893_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 893), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)893_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 156 matches and 800 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (800, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 156 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 156 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)151_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 151), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)151_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1082
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1082 weight vectors
  Containing 209 true matches and 873 true non-matches
    (19.32% true matches)
  Identified 1035 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1000  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1035 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1081
  Number of unique weight vectors: 1035

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1035, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1035 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1035 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 24 matches and 64 non-matches
    Purity of oracle classification:  0.727
    Entropy of oracle classification: 0.845
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

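The purity and entropy values reported above follow the standard two-class definitions (majority-class fraction, and binary Shannon entropy of the match/non-match proportions). A minimal sketch that reproduces the reported figures; the function names are illustrative, not taken from the original script:

```python
import math

def purity(num_matches, num_non_matches):
    """Purity: fraction of the cluster belonging to the majority class."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    """Binary Shannon entropy of the match / non-match distribution."""
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# The 88 oracle-classified vectors above (24 matches, 64 non-matches):
print(round(purity(24, 64), 3))   # 0.727
print(round(entropy(24, 64), 3))  # 0.845
```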
Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 947 weight vectors
  Based on 24 matches and 64 non-matches
  Classified 95 matches and 852 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (95, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)
    (852, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)

Current size of match and non-match training data sets: 24 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.73 and entropy 0.85
- Size 95 weight vectors
- Estimated match proportion 0.273

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 95 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

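The oracle simulates a human classifier with a configurable accuracy: each true match status is returned correctly with probability `oracle_acc`, otherwise flipped, so at 100% accuracy all labels are correct, as above. A sketch of this simulation; the flip model and random seeding are assumptions, since the original sampling code is not shown in this log:

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Simulate a human oracle: return each true label correctly with
    probability `accuracy`, otherwise return the flipped label."""
    rng = rng or random.Random()
    labels = []
    for label in true_labels:
        if rng.random() < accuracy:
            labels.append(label)
        else:
            labels.append(not label)
    return labels
```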
Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(10)386_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (10, 1 - acm diverg, 386), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)386_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 757
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 757 weight vectors
  Containing 211 true matches and 546 true non-matches
    (27.87% true matches)
  Identified 718 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   699  (97.35%)
          2 :    16  (2.23%)
          3 :     2  (0.28%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 718 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 174
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 543

Removed 1 non-pure weight vector

Final number of weight vectors to use: 756
  Number of unique weight vectors: 718

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (718, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 718 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 718 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 33 matches and 51 non-matches
    Purity of oracle classification:  0.607
    Entropy of oracle classification: 0.967
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 634 weight vectors
  Based on 33 matches and 51 non-matches
  Classified 303 matches and 331 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (303, 0.6071428571428571, 0.9666186325481028, 0.39285714285714285)
    (331, 0.6071428571428571, 0.9666186325481028, 0.39285714285714285)

Current size of match and non-match training data sets: 33 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.97
- Size 303 weight vectors
- Estimated match proportion 0.393

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 303 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.600, 0.944, 0.250, 0.200, 0.186, 0.136, 0.118] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.913, 1.000, 0.184, 0.175, 0.087, 0.233, 0.167] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 41 matches and 29 non-matches
    Purity of oracle classification:  0.586
    Entropy of oracle classification: 0.979
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  29
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)227_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990385
recall                 0.344482
f-measure              0.511166
da                          104
dm                            0
ndm                           0
tp                          103
fp                            1
tn                  4.76529e+07
fn                          196
Name: (10, 1 - acm diverg, 227), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)227_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 533
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 533 weight vectors
  Containing 151 true matches and 382 true non-matches
    (28.33% true matches)
  Identified 518 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   511  (98.65%)
          2 :     4  (0.77%)
          3 :     2  (0.39%)
          8 :     1  (0.19%)

Identified 1 non-pure unique weight vector (from 518 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 136
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 381

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 525
  Number of unique weight vectors: 517

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (517, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 517 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 517 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 27 matches and 54 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 436 weight vectors
  Based on 27 matches and 54 non-matches
  Classified 95 matches and 341 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (95, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (341, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 27 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 341 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 341 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.846, 0.542, 0.588, 0.579, 0.423] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 9 matches and 59 non-matches
    Purity of oracle classification:  0.868
    Entropy of oracle classification: 0.564
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

104.0
Analyzing file: diverg(20)55_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (20, 1 - acm diverg, 55), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)55_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1032
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1032 weight vectors
  Containing 212 true matches and 820 true non-matches
    (20.54% true matches)
  Identified 980 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   945  (96.43%)
          2 :    32  (3.27%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 980 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 799

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1031
  Number of unique weight vectors: 980

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (980, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 980 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 980 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
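
The "far" initial selection is a farthest-first traversal: starting from a seed vector, each step greedily adds the candidate whose minimum Euclidean distance to the already-selected set is largest, which spreads the sample across the weight-vector space. A minimal sketch (the seed choice and distance metric are assumptions, not taken from the original code):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: pick k vectors spread out in the space."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # assumption: seed with the first vector
    while len(selected) < k and len(selected) < len(vectors):
        # A candidate's distance to the selected set is the distance to its
        # nearest selected vector; pick the candidate maximising that.
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected

corners = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.0, 1.0)], 3)
```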

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 893 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 136 matches and 757 non-matches
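
The split step trains a classifier on the oracle-labelled sample and uses its predictions to partition the remaining cluster into a predicted-match and a predicted-non-match sub-cluster. A sketch using scikit-learn's `SVC` (the kernel choice and the use of scikit-learn are assumptions; the log only says "SVM classification"):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled sample, then split the rest of
    the cluster into predicted matches (label 1) and non-matches (label 0)."""
    clf = SVC(kernel="linear")  # assumption: the log does not state the kernel
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```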

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (757, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 757 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 757 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 10 matches and 63 non-matches
    Purity of oracle classification:  0.863
    Entropy of oracle classification: 0.576
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analyzing the file: diverg(15)324_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 324), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)324_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 695
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 695 weight vectors
  Containing 200 true matches and 495 true non-matches
    (28.78% true matches)
  Identified 650 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   616  (94.77%)
          2 :    31  (4.77%)
          3 :     2  (0.31%)
         11 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 650 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 474

Removed 1 non-pure weight vector

Final number of weight vectors to use: 694
  Number of unique weight vectors: 650

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (650, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 650 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 650 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 567 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 156 matches and 411 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (411, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 411 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 411 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [1.000, 0.000, 0.700, 0.429, 0.476, 0.647, 0.810] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 1 match and 70 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.107
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing the file: diverg(15)284_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 284), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)284_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 754
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 754 weight vectors
  Containing 222 true matches and 532 true non-matches
    (29.44% true matches)
  Identified 718 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   699  (97.35%)
          2 :    16  (2.23%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 718 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 529

Removed 1 non-pure weight vector

Final number of weight vectors to use: 753
  Number of unique weight vectors: 718

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (718, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 718 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 718 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
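The "farthest first" selections logged above can be reproduced with a greedy farthest-first traversal: seed with one vector, then repeatedly add the vector whose distance to the nearest already-selected vector is largest. A minimal sketch, assuming Euclidean distance between weight vectors (the function name and `seed` parameter are our own):

```python
import math
import random

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal: start from a random vector, then
    repeatedly pick the vector whose minimum distance to the vectors
    selected so far is largest."""
    rnd = random.Random(seed)
    first = rnd.choice(vectors)
    selected = [first]
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, first) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        # Update nearest-selected distances after adding vectors[i]
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected
```

Each new pick maximises coverage of the weight-vector space, which is why the selected vectors above span very different similarity profiles rather than clustering around one region.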

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
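The purity and entropy reported for the oracle's classification follow the standard two-class definitions: purity is the majority-class fraction and entropy is the base-2 Shannon entropy of the match/non-match split. A short sketch (the function names are ours):

```python
import math

def purity(num_match, num_non_match):
    """Fraction of the cluster belonging to the majority class."""
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    """Base-2 Shannon entropy of the match / non-match distribution."""
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h
```

For the 28 matches and 56 non-matches above this gives purity 56/84 ≈ 0.667 and entropy ≈ 0.918, matching the logged values.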

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 634 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 135 matches and 499 non-matches
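The split step trains a classifier on the oracle-labelled vectors and divides the remaining unlabelled vectors of the cluster into a predicted-match and a predicted-non-match sub-cluster, which then re-enter the queue. The program uses an SVM for this; purely as a dependency-free stand-in, the sketch below substitutes a nearest-centroid rule for the SVM (that substitution, and all names, are ours):

```python
import math

def split_cluster(labelled, labels, remaining):
    """Split `remaining` into two sub-clusters by nearest class centroid
    of the labelled vectors (a stand-in for the program's SVM split).
    labels: 1 = match, 0 = non-match."""
    def centroid(cls):
        pts = [v for v, lab in zip(labelled, labels) if lab == cls]
        return [sum(c) / len(pts) for c in zip(*pts)]
    c_match, c_non = centroid(1), centroid(0)
    matches, non_matches = [], []
    for v in remaining:
        (matches if math.dist(v, c_match) <= math.dist(v, c_non)
         else non_matches).append(v)
    return matches, non_matches
```

Either way, the two predicted sub-clusters inherit the purity and match-proportion estimates derived from the oracle sample, as the queue statistics in the next loop show.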

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (499, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 135 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 135 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(15)950_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (15, 1 - acm diverg, 950), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)950_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 997
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 997 weight vectors
  Containing 170 true matches and 827 true non-matches
    (17.05% true matches)
  Identified 960 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   929  (96.77%)
          2 :    28  (2.92%)
          3 :     2  (0.21%)
          6 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 960 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 153
     0.000 : 807
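The occurrence and pureness statistics come from a single pass over the weight vectors: count how often each unique vector occurs and what fraction of those occurrences are true matches. A sketch (the function name is ours):

```python
from collections import defaultdict

def analyse_vectors(weight_vectors, match_flags):
    """Return {unique_vector: (occurrence_count, pureness)} where
    pureness is the fraction of that vector's occurrences that are
    true matches."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [num_matches, total]
    for vec, is_match in zip(weight_vectors, match_flags):
        entry = counts[tuple(vec)]
        entry[0] += int(is_match)
        entry[1] += 1
    return {v: (total, matches / total)
            for v, (matches, total) in counts.items()}
```

A vector whose pureness is strictly between 0 and 1 is "non-pure": identical similarity values were produced by both a true match and a true non-match, so its minority-class copies are removed before training selection.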

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 997
  Number of unique weight vectors: 960

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (960, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 960 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 960 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 25 matches and 62 non-matches
    Purity of oracle classification:  0.713
    Entropy of oracle classification: 0.865
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 873 weight vectors
  Based on 25 matches and 62 non-matches
  Classified 42 matches and 831 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (42, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)
    (831, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 25 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.71 and entropy 0.87
- Size 831 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 831 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 16 matches and 56 non-matches
    Purity of oracle classification:  0.778
    Entropy of oracle classification: 0.764
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analyzing file: diverg(20)505_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (20, 1 - acm diverg, 505), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)505_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 920
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 920 weight vectors
  Containing 215 true matches and 705 true non-matches
    (23.37% true matches)
  Identified 868 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   832  (95.85%)
          2 :    33  (3.80%)
          3 :     2  (0.23%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 868 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 684

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 919
  Number of unique weight vectors: 868

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (868, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 868 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 868 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 782 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 158 matches and 624 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (158, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (624, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 158 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 158 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
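The "farthest first" selection logged above is a greedy farthest-first traversal: each step adds the vector whose minimum distance to the already-selected set is largest. A minimal sketch (Euclidean distance and seeding with the first vector are my assumptions; the script's seeding and tie-breaking may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising the minimum Euclidean
    distance to the vectors already selected (seeded with the first one)."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

picks = farthest_first([[0, 0], [1, 1], [0.1, 0], [10, 10]], 3)
# picks == [[0, 0], [10, 10], [1, 1]]
```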

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analyzing file: diverg(15)11_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 11), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)11_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 376
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 376 weight vectors
  Containing 194 true matches and 182 true non-matches
    (51.60% true matches)
  Identified 355 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   341  (96.06%)
          2 :    11  (3.10%)
          3 :     2  (0.56%)
          7 :     1  (0.28%)
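The frequency distribution above (341·1 + 11·2 + 2·3 + 1·7 = 376 vectors over 355 unique ones) can be reproduced by counting duplicate weight vectors, e.g.:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight vectors
    occurring that often; vectors are made hashable as tuples first."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

dist = occurrence_distribution([[0.1, 0.2], [0.1, 0.2], [0.3, 0.4]])
# dist == Counter({2: 1, 1: 1}): one vector occurs twice, one occurs once
```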

Identified 0 non-pure unique weight vectors (from 355 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 182

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 376
  Number of unique weight vectors: 355

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (355, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 355 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 355 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 41 matches and 34 non-matches
    Purity of oracle classification:  0.547
    Entropy of oracle classification: 0.994
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  34
    Number of false non-matches: 0
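The oracle above simulates manual classification at a configurable accuracy (the oracle_acc parameter in the usage notes). A minimal sketch, assuming each true label is flipped independently with probability 1 − accuracy (the script's exact noise model may differ):

```python
import random

def noisy_oracle(true_labels, accuracy, seed=42):
    """Flip each binary label with probability 1 - accuracy."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else 1 - lab
            for lab in true_labels]

labels = noisy_oracle([1, 0, 1, 1], accuracy=1.0)
# with accuracy 1.0 no labels are flipped: [1, 0, 1, 1]
```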

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 280 weight vectors
  Based on 41 matches and 34 non-matches
  Classified 131 matches and 149 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.5466666666666666, 0.993707106604508, 0.5466666666666666)
    (149, 0.5466666666666666, 0.993707106604508, 0.5466666666666666)

Current size of match and non-match training data sets: 41 / 34

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 131 weight vectors
- Estimated match proportion 0.547

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 49 matches and 6 non-matches
    Purity of oracle classification:  0.891
    Entropy of oracle classification: 0.497
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analyzing file: diverg(10)101_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 101), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)101_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 241
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 241 weight vectors
  Containing 184 true matches and 57 true non-matches
    (76.35% true matches)
  Identified 222 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   209  (94.14%)
          2 :    10  (4.50%)
          3 :     2  (0.90%)
          6 :     1  (0.45%)

Identified 0 non-pure unique weight vectors (from 222 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 165
     0.000 : 57

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 241
  Number of unique weight vectors: 222

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (222, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 222 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 67

Perform initial selection using "far" method

Farthest first selection of 67 weight vectors from 222 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 34 matches and 33 non-matches
    Purity of oracle classification:  0.507
    Entropy of oracle classification: 1.000
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  33
    Number of false non-matches: 0

Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 155 weight vectors
  Based on 34 matches and 33 non-matches
  Classified 136 matches and 19 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 67
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.5074626865671642, 0.9998393017810485, 0.5074626865671642)
    (19, 0.5074626865671642, 0.9998393017810485, 0.5074626865671642)

Current size of match and non-match training data sets: 34 / 33

Selected cluster (queue ordering: random) with:
- Purity 0.51 and entropy 1.00
- Size 136 weight vectors
- Estimated match proportion 0.507

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analyzing file: diverg(10)74_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (10, 1 - acm diverg, 74), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)74_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 480
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 480 weight vectors
  Containing 154 true matches and 326 true non-matches
    (32.08% true matches)
  Identified 467 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   458  (98.07%)
          2 :     6  (1.28%)
          3 :     2  (0.43%)
          4 :     1  (0.21%)

Identified 0 non-pure unique weight vectors (from 467 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 141
     0.000 : 326

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 480
  Number of unique weight vectors: 467

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (467, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 467 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 467 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)

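Farthest-first ("far") selection, as used above, greedily picks each next vector as the one whose distance to the closest already-selected vector is largest. A minimal sketch of the idea (not the original implementation; the deterministic start point and Euclidean distance are assumptions):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector, then
    repeatedly add the vector farthest from the current selection."""
    dist = math.dist                          # Euclidean distance (assumption)
    selected = [0]                            # deterministic start (assumption)
    # distance of every vector to its nearest selected vector so far
    d = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=d.__getitem__)
        selected.append(nxt)
        d = [min(d[i], dist(vectors[i], vectors[nxt]))
             for i in range(len(vectors))]
    return selected

# two tight pairs in opposite corners; picking 2 spans the corners
pts = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
```

On `pts`, selecting 2 vectors yields one point from each corner pair, which is why the selections above cover the extremes of the weight-vector space.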
Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 28 matches and 51 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.938
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0
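The purity and entropy figures reported by the oracle step follow directly from the match/non-match counts: purity is the majority-class fraction, entropy the binary Shannon entropy of the match proportion. A small sketch reproducing the numbers above (a hypothetical helper, not the original code):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary entropy of a cluster."""
    n = num_matches + num_non_matches
    p = num_matches / n                       # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                           # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = cluster_stats(28, 51)       # 28 matches, 51 non-matches
# purity ~ 0.646, entropy ~ 0.938, matching the values reported above
```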

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 388 weight vectors
  Based on 28 matches and 51 non-matches
  Classified 114 matches and 274 non-matches
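The split step trains a classifier on the oracle-labelled vectors and partitions the unlabelled remainder into two child clusters. A sketch of such an SVM split using scikit-learn (assumed library; the toy training data is made up for illustration):

```python
from sklearn.svm import SVC

# toy stand-in for oracle-labelled weight vectors (1 = match, 0 = non-match)
train_X = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]]
train_y = [1, 1, 0, 0]

clf = SVC(kernel="linear")                    # linear SVM (assumption)
clf.fit(train_X, train_y)

# split the unlabelled remainder into two new clusters
rest = [[0.95, 0.85], [0.15, 0.25]]
pred = clf.predict(rest)
match_cluster = [v for v, p in zip(rest, pred) if p == 1]
nonmatch_cluster = [v for v, p in zip(rest, pred) if p == 0]
```

Both child clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.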

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (114, 0.6455696202531646, 0.9379626436434423, 0.35443037974683544)
    (274, 0.6455696202531646, 0.9379626436434423, 0.35443037974683544)

Current size of match and non-match training data sets: 28 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 114 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 114 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 42 matches and 8 non-matches
    Purity of oracle classification:  0.840
    Entropy of oracle classification: 0.634
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing the file: diverg(15)511_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 511), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)511_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 835
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 835 weight vectors
  Containing 208 true matches and 627 true non-matches
    (24.91% true matches)
  Identified 788 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   753  (95.56%)
          2 :    32  (4.06%)
          3 :     2  (0.25%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 788 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 606

Removed 1 non-pure weight vector

Final number of weight vectors to use: 834
  Number of unique weight vectors: 788

Time to load and analyse the weight vector file: 0.01 sec
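The clean-up above groups identical weight vectors, measures each group's pureness (fraction of its occurrences that are true matches), and discards the minority-class copies of any non-pure group. A hypothetical helper sketching that step (not the original code; ties at pureness 0.5 resolve towards matches here):

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """labelled_vectors: iterable of (weight_vector, is_match) pairs.
    Keeps only the majority-class copies of each unique weight vector."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)  # fraction of true matches
        majority = pureness >= 0.5            # tie goes to matches (assumption)
        kept.extend((vec, m) for m in labels if m == majority)
    return kept

# 12 copies of one vector: 11 matches + 1 non-match (pureness 0.917);
# the single minority-class non-match copy is removed
data = [((0.9, 1.0), True)] * 11 + [((0.9, 1.0), False)]
```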

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (788, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 788 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 788 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.722, 0.471, 0.545, 0.579] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.800, 0.000, 0.556, 0.182, 0.500, 0.071, 0.400] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.344, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.300, 0.524, 0.727, 0.762] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 24 matches and 61 non-matches
    Purity of oracle classification:  0.718
    Entropy of oracle classification: 0.859
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 703 weight vectors
  Based on 24 matches and 61 non-matches
  Classified 148 matches and 555 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.7176470588235294, 0.8586370819183629, 0.2823529411764706)
    (555, 0.7176470588235294, 0.8586370819183629, 0.2823529411764706)

Current size of match and non-match training data sets: 24 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 555 weight vectors
- Estimated match proportion 0.282

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 555 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.800, 0.000, 0.625, 0.571, 0.467, 0.474, 0.667] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.462, 0.667, 0.636, 0.368, 0.500] (False)
    [1.000, 0.000, 0.313, 0.435, 0.467, 0.600, 0.611] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.808, 0.478, 0.636, 0.786, 0.500] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.783, 0.357, 0.750, 0.412, 0.238] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.857, 0.111, 0.444, 0.529, 0.500] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [1.000, 0.000, 0.571, 0.357, 0.632, 0.571, 0.833] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 6 matches and 62 non-matches
    Purity of oracle classification:  0.912
    Entropy of oracle classification: 0.431
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(20)180_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 180), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)180_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
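
Farthest-first selection grows the sample greedily: starting from one vector, it repeatedly adds the vector whose minimum distance to the vectors already selected is largest, so the sample spreads out over the cluster. A minimal sketch (Euclidean distance and starting from the first vector are assumptions):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    # min_d[i] = distance from vectors[i] to its nearest selected vector
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])  # farthest vector
        selected.append(vectors[i])
        min_d = [min(d, dist(v, vectors[i])) for v, d in zip(vectors, min_d)]
    return selected
```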

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
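
Purity is the fraction of the majority class in the labelled sample, and entropy is the binary entropy of the match proportion; a small sketch that reproduces the statistics above:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary entropy of the
    match proportion (0.0 for a perfectly pure sample)."""
    p = num_matches / (num_matches + num_non_matches)
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

With 14 matches and 54 non-matches this gives purity ≈ 0.794 and entropy ≈ 0.734, as reported by the oracle step above.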

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)741_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 741), dtype: object
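
The precision, recall and f-measure values in the Series above follow the usual definitions over the tp, fp and fn counts; a quick sketch:

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall and F1 from raw match counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For tp=45, fp=1, fn=254 this reproduces the logged precision 0.978261, recall 0.150502 and f-measure 0.26087.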

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)741_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as the proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(15)840_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 840), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)840_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 786
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 786 weight vectors
  Containing 208 true matches and 578 true non-matches
    (26.46% true matches)
  Identified 757 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   740  (97.75%)
          2 :    14  (1.85%)
          3 :     2  (0.26%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 757 unique weight vectors)
Pureness (as the proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 575

Removed 1 non-pure weight vector

Final number of weight vectors to use: 785
  Number of unique weight vectors: 757

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (757, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 757 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 757 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.233, 0.484, 0.579, 0.455, 0.714] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 672 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 145 matches and 527 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (527, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 145 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 51 matches and 4 non-matches
    Purity of oracle classification:  0.927
    Entropy of oracle classification: 0.376
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(20)874_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 874), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)874_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 845
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 845 weight vectors
  Containing 227 true matches and 618 true non-matches
    (26.86% true matches)
  Identified 788 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   751  (95.30%)
          2 :    34  (4.31%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 788 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 597

Removed 1 non-pure weight vector

Final number of weight vectors to use: 844
  Number of unique weight vectors: 788
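The load-and-analyse step above counts how often each unique weight vector occurs and flags "non-pure" vectors whose duplicate copies carry both match and non-match labels. A minimal sketch under that reading of the log (function and variable names are illustrative, not the original code's):

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(pairs):
    """pairs: list of (weight_tuple, is_match).
    Returns occurrence counts, pureness (fraction of copies that are true
    matches) per unique vector, and the list of non-pure vectors."""
    freq = Counter(w for w, _ in pairs)
    labels = defaultdict(list)
    for w, is_match in pairs:
        labels[w].append(is_match)
    pureness = {w: sum(ms) / len(ms) for w, ms in labels.items()}
    non_pure = [w for w, p in pureness.items() if 0.0 < p < 1.0]
    return freq, pureness, non_pure
```

The minority-class copies of each non-pure vector would then be removed, as the log reports.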

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (788, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 788 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 788 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.722, 0.471, 0.545, 0.579] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 0.963, 1.000, 1.000] (True)
    [0.800, 0.000, 0.556, 0.182, 0.500, 0.071, 0.400] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.344, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.300, 0.524, 0.727, 0.762] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
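The "farthest first" selection above is presumably the classic greedy farthest-first traversal: start from one vector, then repeatedly add the remaining vector whose minimum distance to the already selected set is largest. A minimal sketch, where the fixed starting index and the Euclidean metric are assumptions:

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal over weight vectors.
    vectors: equal-length numeric sequences; k: sample size."""
    remaining = list(vectors)
    selected = [remaining.pop(start)]
    while remaining and len(selected) < k:
        # the vector farthest from its nearest already-selected vector
        far = max(remaining, key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(far)
        selected.append(far)
    return selected
```

This spreads the sample across the cluster, which is why the selected vectors above mix clear matches and clear non-matches.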

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 26 matches and 59 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.888
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 703 weight vectors
  Based on 26 matches and 59 non-matches
  Classified 142 matches and 561 non-matches
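The split step trains a classifier on the oracle-labelled sample and partitions the rest of the cluster by predicted class. The log uses an SVM; the sketch below plainly substitutes a nearest-centroid classifier so it stays dependency-free — a stand-in for the idea, not the script's actual SVM call:

```python
import math

def centroid_split(unlabelled, matches, non_matches):
    """Partition unlabelled weight vectors into (predicted_matches,
    predicted_non_matches) by distance to each labelled set's centroid.
    A nearest-centroid stand-in for the SVM split reported in the log."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]
    cm, cn = centroid(matches), centroid(non_matches)
    pred_m, pred_n = [], []
    for v in unlabelled:
        (pred_m if math.dist(v, cm) <= math.dist(v, cn) else pred_n).append(v)
    return pred_m, pred_n
```

The two predicted-class subsets then re-enter the queue as child clusters, as in Loop 2 below.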

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)
    (561, 0.6941176470588235, 0.8883630233845602, 0.3058823529411765)

Current size of match and non-match training data sets: 26 / 59

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 142 weight vectors
- Estimated match proportion 0.306

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 50 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.235
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)175_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (15, 1 - acm diverg, 175), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)175_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 982
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 982 weight vectors
  Containing 168 true matches and 814 true non-matches
    (17.11% true matches)
  Identified 945 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   914  (96.72%)
          2 :    28  (2.96%)
          3 :     2  (0.21%)
          6 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 945 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 151
     0.000 : 794

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 982
  Number of unique weight vectors: 945

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (945, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 945 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 945 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 858 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 30 matches and 828 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (30, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (828, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 30 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 22

Farthest first selection of 22 weight vectors from 30 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 0.833, 1.000, 1.000, 0.935] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.971, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)

Perform oracle with 100.00% accuracy on 22 weight vectors
  The oracle will correctly classify 22 weight vectors and wrongly classify 0
  Classified 22 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      22
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 22 weight vectors (classified by oracle) from cluster

Cluster is pure enough and not too large, add its 30 weight vectors to:
  Match training set

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 3: Queue length: 1
  Number of manual oracle classifications performed: 109
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (828, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 54 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 828 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 828 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.942, 1.000, 0.156, 0.172, 0.189, 0.148, 0.133] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 18 matches and 52 non-matches
    Purity of oracle classification:  0.743
    Entropy of oracle classification: 0.822
    Number of true matches:      18
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing the file: diverg(20)766_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 766), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)766_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 799
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 799 weight vectors
  Containing 224 true matches and 575 true non-matches
    (28.04% true matches)
  Identified 760 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   741  (97.50%)
          2 :    16  (2.11%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 760 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 572

Removed 1 non-pure weight vector

Final number of weight vectors to use: 798
  Number of unique weight vectors: 760

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (760, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 760 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.429, 0.786, 0.750, 0.389, 0.857] (False)
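
Farthest-first selection as performed above is the classic greedy traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and an arbitrary starting vector (the log states neither choice):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors: each pick maximises the minimum
    Euclidean distance to the vectors selected so far."""
    selected = [vectors[0]]                      # arbitrary starting vector
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        far = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[far])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[far]))
    return selected
```

This naturally picks extreme corners of the weight-vector space first, which is why the list above mixes near-all-1.0 vectors (likely matches) with near-all-0.0 vectors (likely non-matches).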

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
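
The purity and entropy figures reported for each oracle classification follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary (Shannon) entropy of the match proportion. A sketch that reproduces the values above:

```python
import math

def purity_entropy(num_match, num_nonmatch):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy (in bits) of the match proportion."""
    p = num_match / (num_match + num_nonmatch)
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

print(purity_entropy(29, 56))   # (0.6588..., 0.9259...) as reported above
```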

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 675 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 145 matches and 530 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (530, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 145 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 52 matches and 2 non-matches
    Purity of oracle classification:  0.963
    Entropy of oracle classification: 0.229
    Number of true matches:      52
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0
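
The oracle runs here at 100.00% accuracy, so no labels are flipped; the `oracle_acc` parameter in general allows a noisy oracle. A minimal sketch of such a simulated oracle — the flip mechanism shown is an assumption, not the program's confirmed implementation:

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    """Return each true label unchanged with probability `accuracy`,
    flipped otherwise (simulating human labelling error)."""
    rng = rng or random.Random(42)      # seed chosen for illustration only
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]

# At accuracy 1.0 every label is returned correctly:
print(noisy_oracle([True, False, True], 1.0))  # [True, False, True]
```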

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)240_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 240), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)240_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 515
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 515 weight vectors
  Containing 222 true matches and 293 true non-matches
    (43.11% true matches)
  Identified 476 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   457  (96.01%)
          2 :    16  (3.36%)
          3 :     2  (0.42%)
         20 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 476 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 290

Removed 1 non-pure weight vector

Final number of weight vectors to use: 514
  Number of unique weight vectors: 476
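
The uniqueness and pureness analysis above — how often each distinct weight vector occurs, and what fraction of its occurrences are true matches — can be sketched with `collections.Counter` (before minority-class copies of any non-pure vector are removed):

```python
from collections import Counter

def analyse_vectors(weight_vectors, match_flags):
    """Frequency and pureness (match fraction) of each unique vector."""
    keys = [tuple(v) for v in weight_vectors]
    freq = Counter(keys)                          # occurrences per unique vector
    match_count = Counter(k for k, is_match in zip(keys, match_flags)
                          if is_match)
    pureness = {k: match_count[k] / n for k, n in freq.items()}
    occ_dist = Counter(freq.values())             # occurrence histogram
    return freq, pureness, occ_dist
```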

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (476, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 476 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 476 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 36 matches and 44 non-matches
    Purity of oracle classification:  0.550
    Entropy of oracle classification: 0.993
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 396 weight vectors
  Based on 36 matches and 44 non-matches
  Classified 314 matches and 82 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (314, 0.55, 0.9927744539878084, 0.45)
    (82, 0.55, 0.9927744539878084, 0.45)

Current size of match and non-match training data sets: 36 / 44

Selected cluster with (queue ordering: random):
- Purity 0.55 and entropy 0.99
- Size 314 weight vectors
- Estimated match proportion 0.450

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 314 vectors
  The selected farthest weight vectors are:
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 45 matches and 28 non-matches
    Purity of oracle classification:  0.616
    Entropy of oracle classification: 0.961
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)781_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 781), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)781_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 0 matches and 829 non-matches

40.0
Analysing the file: diverg(20)885_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 885), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)885_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 407
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 407 weight vectors
  Containing 217 true matches and 190 true non-matches
    (53.32% true matches)
  Identified 370 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   352  (95.14%)
          2 :    15  (4.05%)
          3 :     2  (0.54%)
         19 :     1  (0.27%)

Identified 1 non-pure unique weight vector (from 370 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 187

Removed 1 non-pure weight vector

Final number of weight vectors to use: 406
  Number of unique weight vectors: 370

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (370, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 370 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 370 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
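
The "farthest first" selections above are the greedy k-center traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest, so that the sample spreads across the whole cluster. A small sketch (the function name and toy data are illustrative):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal (k-center heuristic)."""
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    selected = [int(rng.integers(len(vectors)))]
    # Distance of every vector to its nearest selected vector so far
    min_dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))       # farthest from the selected set
        selected.append(nxt)
        dist = np.linalg.norm(vectors - vectors[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist)
    return selected

# Toy example: four 4-dimensional vectors, pick three spread-out ones
demo = [[0, 0, 0, 0], [1, 1, 1, 1], [0.5, 0.5, 0.5, 0.5], [0, 0, 0, 0.1]]
print(farthest_first(demo, 3))
```

Because each pick maximises the distance to everything chosen so far, near-duplicate vectors (such as the toy `[0, 0, 0, 0.1]`) are selected last, if at all.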

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 29 matches and 47 non-matches
    Purity of oracle classification:  0.618
    Entropy of oracle classification: 0.959
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 294 weight vectors
  Based on 29 matches and 47 non-matches
  Classified 145 matches and 149 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.618421052631579, 0.959149554396894, 0.3815789473684211)
    (149, 0.618421052631579, 0.959149554396894, 0.3815789473684211)

Current size of match and non-match training data sets: 29 / 47

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 149 weight vectors
- Estimated match proportion 0.382

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 9 matches and 48 non-matches
    Purity of oracle classification:  0.842
    Entropy of oracle classification: 0.629
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)5_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984615
recall                 0.214047
f-measure              0.351648
da                           65
dm                            0
ndm                           0
tp                           64
fp                            1
tn                  4.76529e+07
fn                          235
Name: (10, 1 - acm diverg, 5), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)5_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 542
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 542 weight vectors
  Containing 181 true matches and 361 true non-matches
    (33.39% true matches)
  Identified 514 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   500  (97.28%)
          2 :    11  (2.14%)
          3 :     2  (0.39%)
         14 :     1  (0.19%)

Identified 1 non-pure unique weight vector (from 514 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 155
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 358

Removed 1 non-pure weight vector

Final number of weight vectors to use: 541
  Number of unique weight vectors: 514

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (514, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 514 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 514 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.667, 0.400, 0.583, 0.563] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 32 matches and 49 non-matches
    Purity of oracle classification:  0.605
    Entropy of oracle classification: 0.968
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 433 weight vectors
  Based on 32 matches and 49 non-matches
  Classified 127 matches and 306 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (127, 0.6049382716049383, 0.9679884922470297, 0.3950617283950617)
    (306, 0.6049382716049383, 0.9679884922470297, 0.3950617283950617)

Current size of match and non-match training data sets: 32 / 49

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 127 weight vectors
- Estimated match proportion 0.395

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 127 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

65.0
Analysing file: diverg(20)415_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 415), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)415_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 541
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 541 weight vectors
  Containing 220 true matches and 321 true non-matches
    (40.67% true matches)
  Identified 503 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   485  (96.42%)
          2 :    15  (2.98%)
          3 :     2  (0.40%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 503 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 318

Removed 1 non-pure weight vector

Final number of weight vectors to use: 540
  Number of unique weight vectors: 503

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (503, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 503 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 503 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 32 matches and 48 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0
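The purity and entropy figures reported after each oracle step follow directly from the match/non-match counts: purity is the majority-class fraction, and entropy is the base-2 Shannon entropy of the label distribution. A minimal sketch, with hypothetical helper names, that reproduces the numbers above:

```python
import math

def purity(num_matches, num_non_matches):
    # Majority-class fraction among the labelled weight vectors
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Base-2 Shannon entropy of the match / non-match distribution
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# Oracle step above: 32 matches, 48 non-matches
print(round(purity(32, 48), 3))   # 0.6
print(round(entropy(32, 48), 3))  # 0.971
```

The same two values, together with the cluster size and the estimated match proportion, form the tuples printed for each cluster in the queue.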

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 423 weight vectors
  Based on 32 matches and 48 non-matches
  Classified 142 matches and 281 non-matches
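The SVM step trains on the oracle-labelled vectors and splits the cluster's remaining unlabelled vectors into a predicted-match and a predicted-non-match sub-cluster, both of which re-enter the queue. The script itself uses an SVM (as the log states); the stdlib sketch below swaps in a nearest-centroid rule purely to illustrate the split mechanics, with all names hypothetical:

```python
import math

def split_cluster(unlabelled, matches, non_matches):
    """Split a cluster's remaining vectors into predicted matches and
    predicted non-matches. The actual script trains an SVM on the
    oracle-labelled vectors; this stand-in uses a nearest-centroid rule
    instead, only to show how two sub-clusters are produced."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    m_centre = centroid(matches)
    n_centre = centroid(non_matches)
    pred_matches, pred_non_matches = [], []
    for v in unlabelled:
        if math.dist(v, m_centre) <= math.dist(v, n_centre):
            pred_matches.append(v)
        else:
            pred_non_matches.append(v)
    return pred_matches, pred_non_matches
```

In the run above, the 423 unlabelled vectors split into sub-clusters of 142 predicted matches and 281 predicted non-matches.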

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6, 0.9709505944546686, 0.4)
    (281, 0.6, 0.9709505944546686, 0.4)

Current size of match and non-match training data sets: 32 / 48

Selected cluster with (queue ordering: random):
- Purity 0.60 and entropy 0.97
- Size 281 weight vectors
- Estimated match proportion 0.400

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 281 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
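Farthest-first selection, as used above, greedily picks vectors that maximise the minimum distance to those already chosen, giving a well-spread sample of the cluster. A sketch under the assumptions of Euclidean distance and an arbitrary starting vector (the actual seeding and distance function may differ):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: start from an arbitrary vector and
    # repeatedly add the vector whose minimum Euclidean distance to the
    # already-selected set is largest.
    selected = [vectors[0]]
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected
```

With k set to the sample size reported for the cluster (69 here), this yields a diverse selection like the one printed above, which is then passed to the oracle.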

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 6 matches and 63 non-matches
    Purity of oracle classification:  0.913
    Entropy of oracle classification: 0.426
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing file: diverg(15)913_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 913), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)913_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 810
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 810 weight vectors
  Containing 223 true matches and 587 true non-matches
    (27.53% true matches)
  Identified 756 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   719  (95.11%)
          2 :    34  (4.50%)
          3 :     2  (0.26%)
         17 :     1  (0.13%)

Identified 1 non-pure unique weight vectors (from 756 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 566

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 809
  Number of unique weight vectors: 756
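The analysis step groups identical weight vectors, tabulates their occurrence frequencies, and computes each unique vector's pureness: the fraction of its occurrences that come from true matching record pairs. Vectors that occur with both match and non-match labels are flagged as non-pure. A hypothetical sketch of that bookkeeping:

```python
from collections import Counter

def analyse_weight_vectors(weight_vectors, true_match_flags):
    # Group identical weight vectors, count their occurrences, and compute
    # each unique vector's pureness (fraction of occurrences that are
    # true matches). Hypothetical helper mirroring the log output above.
    occurrences = Counter()
    match_counts = Counter()
    for vec, is_match in zip(weight_vectors, true_match_flags):
        key = tuple(vec)
        occurrences[key] += 1
        if is_match:
            match_counts[key] += 1
    pureness = {k: match_counts[k] / occurrences[k] for k in occurrences}
    # Non-pure: occurs with both match and non-match labels
    non_pure = [k for k, p in pureness.items() if 0.0 < p < 1.0]
    return occurrences, pureness, non_pure
```

In the run above, one unique vector had pureness 0.941, and its minority-class occurrence was removed before training-example selection began.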

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (756, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 756 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 756 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 671 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 94 matches and 577 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (577, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 577 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 577 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 20 matches and 53 non-matches
    Purity of oracle classification:  0.726
    Entropy of oracle classification: 0.847
    Number of true matches:      20
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(15)765_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 765), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)765_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1081
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1081 weight vectors
  Containing 209 true matches and 872 true non-matches
    (19.33% true matches)
  Identified 1034 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1034 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 851

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1080
  Number of unique weight vectors: 1034

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1034, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1034 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1034 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 946 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 845 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (845, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 101 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-matches
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
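The purity and entropy figures reported after each oracle pass are consistent with the majority-class fraction and the binary class entropy of the match/non-match split (42/1 above gives 0.977 and 0.159 after rounding). A minimal sketch — the script's actual helper functions are not shown in this log:

```python
from math import log2

def purity_entropy(num_match, num_non_match):
    """Purity: fraction of the majority class; entropy: binary class
    entropy of the match/non-match split, in bits."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:          # 0 * log2(0) is taken as 0
            entropy -= q * log2(q)
    return purity, entropy
```

The estimated match proportion reported for each cluster later in the log (e.g. 0.291 for 25 matches out of 86 sampled) is simply the intermediate `p` above.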

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)895_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (15, 1 - acm diverg, 895), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)895_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 897
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 897 weight vectors
  Containing 155 true matches and 742 true non-matches
    (17.28% true matches)
  Identified 861 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   833  (96.75%)
          2 :    25  (2.90%)
          3 :     2  (0.23%)
          8 :     1  (0.12%)
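The occurrence distribution above can be reproduced with two nested `collections.Counter` passes — a sketch, assuming weight vectors are compared element-wise:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then how many
    distinct vectors share each occurrence count (as in the log table)."""
    vec_counts = Counter(map(tuple, weight_vectors))   # vector -> occurrences
    return Counter(vec_counts.values())                # occurrences -> #vectors
```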

Identified 1 non-pure unique weight vector (from 861 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 139
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 721

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 889
  Number of unique weight vectors: 860

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (860, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 860 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 860 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
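The "far" initial selection reads like a standard greedy farthest-first traversal: start from an arbitrary vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A sketch assuming Euclidean distance — the script's actual metric, starting rule, and tie-breaking are not visible in this log:

```python
import random

def farthest_first(vectors, k, seed=None):
    """Greedy farthest-first traversal over a list of weight vectors."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    rng = random.Random(seed)
    selected = [rng.choice(vectors)]       # arbitrary starting vector
    while len(selected) < k:
        # pick the vector farthest from everything selected so far
        nxt = max(vectors, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(nxt)
    return selected
```

This greedy rule explains why the sampled vectors above are spread across extremes of the weight space (many 0.000 and 1.000 components) rather than clustered.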

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 25 matches and 61 non-matches
    Purity of oracle classification:  0.709
    Entropy of oracle classification: 0.870
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
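The oracle accuracy parameter (100.00% in these runs, hence zero false matches and false non-matches) presumably means each queried label is returned correctly with that probability and flipped otherwise. A hypothetical sketch of such an imperfect oracle:

```python
import random

def imperfect_oracle(true_labels, accuracy, seed=None):
    """Simulate a human oracle: each true match label survives with the
    given probability; otherwise the returned label is flipped."""
    rng = random.Random(seed)
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]
```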

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 774 weight vectors
  Based on 25 matches and 61 non-matches
  Classified 87 matches and 687 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (87, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)
    (687, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)

Current size of match and non-match training data sets: 25 / 61

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 687 weight vectors
- Estimated match proportion 0.291

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 687 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 13 matches and 58 non-matches
    Purity of oracle classification:  0.817
    Entropy of oracle classification: 0.687
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing file: diverg(10)686_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.982759
recall                 0.190635
f-measure              0.319328
da                           58
dm                            0
ndm                           0
tp                           57
fp                            1
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 686), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)686_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 665
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 665 weight vectors
  Containing 201 true matches and 464 true non-matches
    (30.23% true matches)
  Identified 615 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   581  (94.47%)
          2 :    31  (5.04%)
          3 :     2  (0.33%)
         16 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 615 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 171
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 443

Removed 1 non-pure weight vector

Final number of weight vectors to use: 664
  Number of unique weight vectors: 615

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (615, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 615 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 615 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 29 matches and 54 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.934
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 532 weight vectors
  Based on 29 matches and 54 non-matches
  Classified 152 matches and 380 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)
    (380, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)

Current size of match and non-match training data sets: 29 / 54

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 152 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 152 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 45 matches and 11 non-matches
    Purity of oracle classification:  0.804
    Entropy of oracle classification: 0.715
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(20)843_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 843), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)843_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 799
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 799 weight vectors
  Containing 224 true matches and 575 true non-matches
    (28.04% true matches)
  Identified 760 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   741  (97.50%)
          2 :    16  (2.11%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 760 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 572

Removed 1 non-pure weight vector

Final number of weight vectors to use: 798
  Number of unique weight vectors: 760

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (760, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 760 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
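The purity and entropy values reported for each oracle step follow the standard two-class definitions: purity is the fraction belonging to the majority class, entropy is the Shannon entropy (base 2) of the match/non-match proportions. A minimal sketch (the helper name is mine, not taken from the script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = base-2 Shannon
    entropy of the match / non-match proportions."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                      # 0 * log(0) is treated as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# For the 29 matches / 56 non-matches above:
# purity ~ 0.659 and entropy ~ 0.926, matching the log.
```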

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 675 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 149 matches and 526 non-matches
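The split step trains a classifier on the oracle-labelled sample and partitions the remaining unlabelled vectors by its predictions. A sketch of that mechanism using scikit-learn's SVC — the kernel and all parameters are assumptions, the script may configure its SVM quite differently:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(match_vecs, non_match_vecs, cluster_vecs):
    """Train an SVM on the labelled training vectors, then split the
    unlabelled cluster into predicted matches and non-matches."""
    X = np.vstack([match_vecs, non_match_vecs])
    y = np.array([1] * len(match_vecs) + [0] * len(non_match_vecs))
    clf = SVC(kernel="linear")           # kernel choice is an assumption
    clf.fit(X, y)
    cluster = np.asarray(cluster_vecs)
    pred = clf.predict(cluster)
    return cluster[pred == 1], cluster[pred == 0]
```

In the run above this corresponds to fitting on the 29 + 56 oracle-classified vectors and splitting the remaining 675 into two child clusters (149 predicted matches, 526 predicted non-matches) that go back on the queue.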

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (526, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 526 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74
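The per-cluster sample size grows with cluster size and depends on the estimated match proportion. The values in this log (e.g. 74 for this 526-vector cluster at proportion 0.341) are consistent with Cochran's sample-size formula with finite-population correction at z = 1.96 and margin of error e = 0.1 — a reconstruction inferred from the numbers, not a quote of the script's code:

```python
def cochran_sample_size(N, p, z=1.96, e=0.1):
    """Cochran's formula: n0 for an infinite population, then the
    finite-population correction for a cluster of N vectors.
    p is the estimated match proportion (assumed, see lead-in)."""
    n0 = z * z * p * (1.0 - p) / (e * e)
    return round(n0 / (1.0 + (n0 - 1.0) / N))
```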

Farthest first selection of 74 weight vectors from 526 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
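The farthest-first selection above can be sketched as a greedy traversal: starting from a seed vector, each new pick is the vector whose minimum distance to everything already selected is largest. The function name, the deterministic seed, and the Euclidean metric are assumptions; the script's implementation may differ:

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Greedily pick k row indices: each new pick maximises the minimum
    Euclidean distance to the already-selected set."""
    X = np.asarray(vectors, dtype=float)
    selected = [start]                       # assumed deterministic seed
    min_dist = np.linalg.norm(X - X[start], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))       # farthest from current selection
        selected.append(nxt)
        np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1), out=min_dist)
    return selected
```

Because each pick maximises distance to the chosen set, the sample spreads across the weight-vector space instead of concentrating in dense regions, which is why the selected lists above mix clear matches, clear non-matches, and borderline vectors.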

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 7 matches and 67 non-matches
    Purity of oracle classification:  0.905
    Entropy of oracle classification: 0.452
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)435_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 435), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)435_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 202 true matches and 660 true non-matches
    (23.43% true matches)
  Identified 813 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   779  (95.82%)
          2 :    31  (3.81%)
          3 :     2  (0.25%)
         15 :     1  (0.12%)
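The uniqueness analysis above amounts to counting duplicate weight vectors and then tallying those counts. A sketch with collections.Counter (the helper name is an assumption):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Count how often each distinct weight vector occurs, then tally
    how many distinct vectors occur once, twice, etc."""
    vec_counts = Counter(tuple(v) for v in vectors)   # distinct vector -> count
    dist = Counter(vec_counts.values())               # occurrence -> how many
    return len(vec_counts), dict(sorted(dist.items()))
```

Applied to the 862 loaded vectors this would be consistent with the distribution shown: 813 unique vectors, of which 779 occur once, 31 twice, 2 three times, and 1 fifteen times (779 + 62 + 6 + 15 = 862).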

Identified 1 non-pure unique weight vector (from 813 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 639

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 861
  Number of unique weight vectors: 813

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (813, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 813 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 813 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 727 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 169 matches and 558 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (169, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (558, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 169 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 169 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 43 matches and 14 non-matches
    Purity of oracle classification:  0.754
    Entropy of oracle classification: 0.804
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  14
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analyzing file: diverg(20)405_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 405), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)405_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1065
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1065 weight vectors
  Containing 209 true matches and 856 true non-matches
    (19.62% true matches)
  Identified 1018 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   983  (96.56%)
          2 :    32  (3.14%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1018 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 835

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1064
  Number of unique weight vectors: 1018

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1018, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1018 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1018 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 931 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 232 matches and 699 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (232, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (699, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 232 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 63

Farthest first selection of 63 weight vectors from 232 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

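The farthest-first selection used above greedily picks, at each step, the vector with the maximum distance to its nearest already-selected vector, so the sample spreads across the cluster. A minimal sketch (the seed choice and the Euclidean metric are assumptions; the program's actual metric is not shown in the log):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector, then
    repeatedly add the vector whose distance to the closest selected
    vector is largest (Euclidean distance assumed)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]               # seed choice is an assumption
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
print(farthest_first(pts, 3))  # picks mutually distant corners first
```

This quadratic-time greedy version is the textbook form; the program may use an optimised variant.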
Perform oracle with 100.00% accuracy on 63 weight vectors
  The oracle will correctly classify 63 weight vectors and wrongly classify 0
  Classified 43 matches and 20 non-matches
    Purity of oracle classification:  0.683
    Entropy of oracle classification: 0.902
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  20
    Number of false non-matches: 0

Deleted 63 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(20)417_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 417), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)417_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 844
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 844 weight vectors
  Containing 226 true matches and 618 true non-matches
    (26.78% true matches)
  Identified 787 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (95.30%)
          2 :    34  (4.32%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 787 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 597

Removed 1 non-pure weight vector

Final number of weight vectors to use: 843
  Number of unique weight vectors: 787

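The analysis step above groups identical weight vectors, reports their occurrence counts, and removes minority-class copies of any group whose pureness (fraction of match labels) is strictly between 0 and 1. A rough sketch, with the grouping key and removal rule inferred from the log, so treat the details as assumptions:

```python
from collections import Counter, defaultdict

def analyse(weight_vectors, labels):
    """Group identical weight vectors, build their occurrence histogram,
    and drop minority-class copies of non-pure groups (pureness strictly
    between 0 and 1), as the log above describes."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, labels):
        groups[tuple(vec)].append(is_match)

    freq = Counter(len(labs) for labs in groups.values())  # occurrence histogram
    kept = []
    for vec, labs in groups.items():
        pureness = sum(labs) / len(labs)
        majority = pureness >= 0.5
        for is_match in labs:
            if pureness in (0.0, 1.0) or is_match == majority:
                kept.append((vec, is_match))   # drop minority-class copies
    return freq, kept

vecs = [(1.0, 0.9), (1.0, 0.9), (0.1, 0.2), (1.0, 0.9)]
labs = [True, True, False, False]   # one non-pure group of size 3
freq, kept = analyse(vecs, labs)
print(dict(freq), len(kept))        # one group of 3, one of 1; 3 vectors kept
```

In the run above this removes a single non-match copy from the pureness-0.950 group, leaving 843 of the original 844 vectors.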
Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (787, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 787 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 787 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

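The "oracle" in these runs simulates a human reviewer with a given accuracy: each queried weight vector receives its true label, possibly flipped. Whether the program flips a fixed count or flips each label independently is not clear from the log; the sketch below uses independent flips, and at 100% accuracy reproduces the 28 / 57 split reported above:

```python
import random

def oracle_classify(true_labels, accuracy, seed=42):
    """Return oracle labels: each true label is kept with probability
    `accuracy` and flipped otherwise (simulated human reviewer)."""
    rnd = random.Random(seed)
    return [lab if rnd.random() < accuracy else not lab
            for lab in true_labels]

# 28 true matches and 57 true non-matches, as in the oracle call above
truth = [True] * 28 + [False] * 57
labels = oracle_classify(truth, accuracy=1.0)
print(sum(labels), len(labels) - sum(labels))   # 28 57
```

With accuracy below 1.0 the flipped labels would contaminate the training data, which is what the `oracle_acc` command-line parameter controls.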
Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 702 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 160 matches and 542 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (160, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (542, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 542 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 542 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 4 matches and 69 non-matches
    Purity of oracle classification:  0.945
    Entropy of oracle classification: 0.306
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)705_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 705), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)705_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1022
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1022 weight vectors
  Containing 207 true matches and 815 true non-matches
    (20.25% true matches)
  Identified 975 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   940  (96.41%)
          2 :    32  (3.28%)
          3 :     2  (0.21%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 975 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 794

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1021
  Number of unique weight vectors: 975

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (975, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 975 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 975 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 888 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 319 matches and 569 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (319, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (569, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 319 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 319 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.909, 1.000, 1.000, 1.000, 0.947] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 41 matches and 28 non-matches
    Purity of oracle classification:  0.594
    Entropy of oracle classification: 0.974
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0
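Purity here is the fraction of the classified vectors in the majority class, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch in plain Python (`purity_entropy` is a hypothetical helper name) that reproduces the figures printed above:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total                  # match proportion
    purity = max(p, 1.0 - p)                 # fraction in the majority class
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0)
    return purity, entropy

# 41 matches and 28 non-matches, as in the oracle output above:
purity, entropy = purity_entropy(41, 28)
print(round(purity, 3), round(entropy, 3))  # 0.594 0.974
```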

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(20)477_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 477), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)477_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
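The farthest-first selection above can be sketched as a greedy traversal: start from one vector, then repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest. A minimal sketch assuming NumPy (`farthest_first` and the seeding choice are illustrative, not the original implementation):

```python
import numpy as np

def farthest_first(vectors, k, seed=42):
    """Greedily select k vectors that are maximally spread out."""
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    selected = [int(rng.integers(len(vectors)))]  # arbitrary starting vector
    # Distance from every vector to its nearest selected vector so far
    min_dist = np.linalg.norm(vectors - vectors[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))            # farthest from the selected set
        selected.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```

Each iteration costs one distance pass over the cluster, so selecting k of n vectors is O(k*n) distance computations.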

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
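This split step trains a classifier on the oracle-labelled vectors and uses it to divide the remaining unlabelled vectors into a predicted-match and a predicted-non-match cluster. A minimal sketch assuming scikit-learn's `SVC` with a linear kernel (`svm_split` is a hypothetical helper name; the original program's SVM settings may differ):

```python
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on oracle-labelled vectors, then split the rest.

    labels: 1 for match, 0 for non-match.
    Returns (predicted_match_cluster, predicted_non_match_cluster).
    """
    clf = SVC(kernel='linear')
    clf.fit(labelled_vecs, labels)
    pred = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 0]
    return matches, non_matches
```

The two resulting clusters are then pushed back onto the queue, inheriting the purity, entropy, and estimated match proportion of the sample they were split from, as the queue printout in the next loop shows.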

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)439_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 439), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)439_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1027
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1027 weight vectors
  Containing 223 true matches and 804 true non-matches
    (21.71% true matches)
  Identified 973 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   936  (96.20%)
          2 :    34  (3.49%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 973 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 783

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 973

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (973, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 973 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 973 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 886 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 131 matches and 755 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (755, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 755 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 755 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 11 matches and 62 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(10)237_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (10, 1 - acm diverg, 237), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)237_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 729
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 729 weight vectors
  Containing 221 true matches and 508 true non-matches
    (30.32% true matches)
  Identified 693 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   677  (97.69%)
          2 :    13  (1.88%)
          3 :     2  (0.29%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 693 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 507

Removed 1 non-pure weight vector

Final number of weight vectors to use: 728
  Number of unique weight vectors: 693

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (693, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 693 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 693 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
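
The "far" method above is a greedy farthest-first traversal: each step selects the weight vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and seeding with the first vector; the program's actual seeding and tie-breaking are unknown:

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each time taking the vector farthest
    (in min-distance terms) from those already selected."""
    selected = [vectors[0]]                      # assumed seed: first vector
    min_d = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(vectors[idx])
        # Each remaining vector's distance to the selected set can only shrink.
        min_d = [min(min_d[i], math.dist(vectors[i], vectors[idx]))
                 for i in range(len(vectors))]
    return selected
```

This spreads the sample across the corners of the weight-vector space, which is why the selected vectors above mix clear matches (all-high similarities) with clear non-matches.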

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
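
An oracle with a given accuracy can be simulated by reporting each true label correctly with that probability and flipping it otherwise; at accuracy 1.0 no labels are flipped, which is why all false-match and false-non-match counts above are zero. A hypothetical sketch (the program's RNG and bookkeeping are unknown):

```python
import random

def oracle_classify(vectors, true_labels, accuracy, rng=random.Random(42)):
    """Simulate a (possibly noisy) human oracle: each true label is
    reported correctly with probability `accuracy`, flipped otherwise."""
    matches, non_matches = [], []
    for vec, truth in zip(vectors, true_labels):
        label = truth if rng.random() < accuracy else not truth
        (matches if label else non_matches).append(vec)
    return matches, non_matches
```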

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 609 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 145 matches and 464 non-matches
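
The split step trains a classifier on the oracle-labelled vectors and partitions the remaining cluster by predicted class. A sketch using scikit-learn's SVC (an assumption: the original program's SVM library, kernel, and parameters are unknown):

```python
# Assumes scikit-learn is available; `svm_split` is an illustrative name.
from sklearn import svm

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-labelled weight vectors and split the
    remaining cluster into predicted-match / predicted-non-match parts."""
    clf = svm.SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(cluster_vecs)
    match_cluster = [v for v, p in zip(cluster_vecs, pred) if p]
    non_match_cluster = [v for v, p in zip(cluster_vecs, pred) if not p]
    return match_cluster, non_match_cluster
```

Both resulting sub-clusters are pushed back onto the queue, as the Loop 2 queue of sizes 145 and 464 shows.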

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (464, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.92
- Size 145 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing the file: diverg(10)155_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976562
recall                  0.41806
f-measure               0.58548
da                          128
dm                            0
ndm                           0
tp                          125
fp                            3
tn                  4.76529e+07
fn                          174
Name: (10, 1 - acm diverg, 155), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)155_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 155
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 155 weight vectors
  Containing 119 true matches and 36 true non-matches
    (76.77% true matches)
  Identified 145 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   138  (95.17%)
          2 :     4  (2.76%)
          3 :     3  (2.07%)
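
The frequency distribution above counts, for each occurrence count, how many unique weight vectors occur exactly that often. A minimal sketch using `collections.Counter` (the function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to (number of unique vectors with that
    count, percentage of all unique vectors)."""
    counts = Counter(map(tuple, weight_vectors))   # copies per unique vector
    dist = Counter(counts.values())                # occurrence -> #uniques
    total_unique = len(counts)
    return {occ: (num, 100.0 * num / total_unique)
            for occ, num in sorted(dist.items())}
```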

Identified 0 non-pure unique weight vectors (from 145 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 109
     0.000 : 36

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 155
  Number of unique weight vectors: 145

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 145 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 58

Perform initial selection using "far" method

Farthest first selection of 58 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 35 matches and 23 non-matches
    Purity of oracle classification:  0.603
    Entropy of oracle classification: 0.969
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  23
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 87 weight vectors
  Based on 35 matches and 23 non-matches
  Classified 87 matches and 0 non-matches

128.0
Analyzing the file: diverg(20)796_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 796), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)796_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1036 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 27 matches and 61 non-matches
    Purity of oracle classification:  0.693
    Entropy of oracle classification: 0.889
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 27 matches and 61 non-matches
  Classified 148 matches and 800 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6931818181818182, 0.8894663896628687, 0.3068181818181818)
    (800, 0.6931818181818182, 0.8894663896628687, 0.3068181818181818)

Current size of match and non-match training data sets: 27 / 61

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 148 weight vectors
- Estimated match proportion 0.307

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing the file: diverg(15)81_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 81), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)81_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 830
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 830 weight vectors
  Containing 227 true matches and 603 true non-matches
    (27.35% true matches)
  Identified 773 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   736  (95.21%)
          2 :    34  (4.40%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vectors (from 773 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 582

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 829
  Number of unique weight vectors: 773

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (773, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 773 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 773 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
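
The farthest-first selection above can be sketched as a greedy farthest-point traversal: repeatedly add the vector whose minimum Euclidean distance to the already-selected set is largest. This is an illustrative reimplementation, not the script's exact code; the seeding strategy and distance metric used by recursive-train-selection.py may differ.

```python
import math

def farthest_first(vectors, k, seed_index=0):
    """Greedy farthest-first traversal: starting from a seed vector,
    repeatedly select the vector whose minimum Euclidean distance to the
    already-selected vectors is largest."""
    selected = [seed_index]
    # min_dist[i] = distance from vector i to its nearest selected vector
    min_dist = [math.dist(v, vectors[seed_index]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):  # refresh nearest-selected distances
            d = math.dist(v, vectors[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected
```

Each round costs O(n) distance updates, so selecting k of n vectors is O(nk); this is the standard 2-approximation heuristic for k-center style coverage.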

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
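
The purity and entropy figures reported above (0.671 and 0.914 for 28 matches / 57 non-matches) follow from the majority-class fraction and the binary Shannon entropy of the match/non-match split; a minimal sketch, not the script's exact code:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = fraction of the majority class; entropy = binary Shannon
    entropy (base 2) of the match/non-match proportions."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For 28/57 this gives purity 57/85 ≈ 0.6706 and entropy ≈ 0.9143, matching the cluster tuples shown in the queue at the next loop.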

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 688 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 151 matches and 537 non-matches
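
The split step above partitions the remaining 688 weight vectors using the 85 oracle-labelled examples as training data. The script trains an SVM for this; the sketch below swaps in a nearest-centroid rule as a simple dependency-free stand-in to illustrate the splitting step, not the actual classifier.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def split_cluster(unlabelled, matches, non_matches):
    """Split a cluster into predicted-match and predicted-non-match
    sub-clusters, assigning each vector to the closer class centroid
    (a stand-in for the SVM decision used by the script)."""
    cm, cn = centroid(matches), centroid(non_matches)

    def dist2(a, b):  # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))

    pred_match = [v for v in unlabelled if dist2(v, cm) < dist2(v, cn)]
    pred_non = [v for v in unlabelled if dist2(v, cm) >= dist2(v, cn)]
    return pred_match, pred_non
```

Both sub-clusters re-enter the queue, which is why Loop 2 below shows a queue length of 2 with sizes 151 and 537.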

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (537, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 537 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 537 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 9 matches and 64 non-matches
    Purity of oracle classification:  0.877
    Entropy of oracle classification: 0.539
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
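
The run above follows a queue-driven loop: pop a cluster, sample from it, have the oracle label the sample, and, if the cluster remains impure or too large, split the remainder into sub-clusters that re-enter the queue, stopping when the manual classification budget is exhausted. A sketch of this control flow under stated assumptions; the function names, thresholds, and stopping rule here are illustrative, not taken from the script:

```python
from collections import deque

def recursive_selection(cluster, budget, sample_fn, oracle_fn, split_fn,
                        min_purity=0.95, max_cluster_size=100):
    """Budgeted, queue-driven training-example selection.
    sample_fn: picks a sample from a cluster (e.g. farthest-first).
    oracle_fn: returns (matches, non_matches) for a sample.
    split_fn:  splits the unlabelled remainder into sub-clusters
               (e.g. via a classifier trained on the labelled sample)."""
    queue = deque([cluster])
    train_m, train_n = [], []
    used = 0
    while queue and used < budget:
        c = queue.popleft()
        sample = sample_fn(c)
        used += len(sample)             # manual classifications consumed
        m, n = oracle_fn(sample)
        train_m += m
        train_n += n
        rest = [v for v in c if v not in sample]
        purity = max(len(m), len(n)) / max(len(sample), 1)
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            queue.extend(split_fn(rest, m, n))
    return train_m, train_n
```

The returned match/non-match lists correspond to the "Current size of match and non-match training data sets" counters in the log.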

39.0
Analysing file: diverg(15)620_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 620), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)620_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 830
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 830 weight vectors
  Containing 227 true matches and 603 true non-matches
    (27.35% true matches)
  Identified 773 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   736  (95.21%)
          2 :    34  (4.40%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 773 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 582

Removed 1 non-pure weight vector

Final number of weight vectors to use: 829
  Number of unique weight vectors: 773

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (773, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 773 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 773 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 688 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 151 matches and 537 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (537, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 537 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 537 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 9 matches and 64 non-matches
    Purity of oracle classification:  0.877
    Entropy of oracle classification: 0.539
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)403_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 403), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)403_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 353
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 353 weight vectors
  Containing 172 true matches and 181 true non-matches
    (48.73% true matches)
  Identified 332 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   320  (96.39%)
          2 :     9  (2.71%)
          3 :     2  (0.60%)
          9 :     1  (0.30%)

Identified 1 non-pure unique weight vector (from 332 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 153
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 178

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 344
  Number of unique weight vectors: 331

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (331, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 331 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 331 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 31 matches and 43 non-matches
    Purity of oracle classification:  0.581
    Entropy of oracle classification: 0.981
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0
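The purity and entropy figures reported for each oracle-classified sample can be reproduced directly from the match/non-match counts: purity is the majority-class proportion and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (the function name is my own):

```python
from math import log2

def purity_entropy(num_match, num_nonmatch):
    # Majority-class purity and binary (Shannon) entropy of an
    # oracle-classified sample of weight vectors.
    total = num_match + num_nonmatch
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# For 31 matches and 43 non-matches this reproduces the
# purity 0.581 and entropy 0.981 logged above.
```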

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 257 weight vectors
  Based on 31 matches and 43 non-matches
  Classified 125 matches and 132 non-matches
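The SVM split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by its predictions. A sketch using scikit-learn's `SVC`; the kernel, parameters, and helper name here are assumptions, not the program's actual configuration:

```python
import numpy as np
from sklearn.svm import SVC

def split_cluster_with_svm(train_vecs, train_labels, cluster_vecs):
    # Train on the oracle-classified sample, then split the remaining
    # weight vectors into a predicted-match and a predicted-non-match
    # sub-cluster (assumed linear kernel).
    clf = SVC(kernel='linear')
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, pred) if p]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if not p]
    return matches, non_matches
```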

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 74
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (125, 0.581081081081081, 0.9809470132751208, 0.4189189189189189)
    (132, 0.581081081081081, 0.9809470132751208, 0.4189189189189189)

Current size of match and non-match training data sets: 31 / 43

Selected cluster (queue ordering: random) with:
- Purity 0.58 and entropy 0.98
- Size 125 weight vectors
- Estimated match proportion 0.419

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 125 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
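Farthest-first selection, as performed above, is a greedy traversal: each next weight vector is the one whose minimum distance to the already-selected set is largest, which spreads the sample across the cluster. A sketch (the starting point and tie-breaking in the actual program may differ):

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    # Greedy farthest-first traversal: repeatedly add the vector whose
    # minimum Euclidean distance to the selected set is largest.
    vecs = np.asarray(vectors, dtype=float)
    selected = [start]
    min_dist = np.linalg.norm(vecs - vecs[start], axis=1)
    while len(selected) < min(k, len(vecs)):
        nxt = int(np.argmax(min_dist))   # farthest from current selection
        selected.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(vecs - vecs[nxt], axis=1))
    return selected
```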

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 48 matches and 6 non-matches
    Purity of oracle classification:  0.889
    Entropy of oracle classification: 0.503
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0
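The oracle step simulates a human reviewer with a given accuracy (here 100%): it returns each weight vector's true match status but flips the answer with probability one minus the accuracy. A minimal sketch (function name and seeding are my own):

```python
import random

def simulate_oracle(true_match_status, accuracy=1.0, seed=42):
    # Return the true match status of each weight vector, flipping each
    # answer with probability (1 - accuracy) to model oracle errors.
    rng = random.Random(seed)
    return [status if rng.random() < accuracy else not status
            for status in true_match_status]
```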

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing the file: diverg(20)884_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 884), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)884_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as a proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector
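The pureness of a unique weight vector is the fraction of its occurrences generated by true matching record pairs; vectors with pureness strictly between 0 and 1 are non-pure, and their minority-class copies are removed. A sketch of the pureness computation (the helper name is my own):

```python
from collections import defaultdict

def pureness_per_unique_vector(weight_vectors, true_match_status):
    # Map each unique weight vector to the fraction of its occurrences
    # that come from true matching record pairs (1.0 or 0.0 means pure).
    counts = defaultdict(lambda: [0, 0])   # vector -> [num matches, total]
    for vec, is_match in zip(weight_vectors, true_match_status):
        counts[tuple(vec)][0] += int(is_match)
        counts[tuple(vec)][1] += 1
    return {vec: m / t for vec, (m, t) in counts.items()}
```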

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88
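The per-cluster sample size appears to follow a standard statistical sample-size calculation with finite-population correction, driven by the cluster's estimated match proportion. A sketch, where the 10% error margin, 95% confidence level (z = 1.96), and rounding are my assumptions:

```python
def sample_size(cluster_size, est_match_prop, sample_error=0.1, z=1.96):
    # n0 = z^2 * p * (1 - p) / e^2, corrected for the finite
    # population of cluster_size unique weight vectors.
    p = est_match_prop
    n0 = z * z * p * (1.0 - p) / (sample_error * sample_error)
    return int(round(n0 / (1.0 + (n0 - 1.0) / cluster_size)))

# With p = 0.5 and a cluster of 1036 unique vectors this gives 88,
# matching the sample size logged above; the program's exact
# parameters and rounding may differ for other clusters.
```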

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 817 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 131 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-matches
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)717_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 717), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)717_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1061
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1061 weight vectors
  Containing 188 true matches and 873 true non-matches
    (17.72% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   988  (96.96%)
          2 :    28  (2.75%)
          3 :     2  (0.20%)
         11 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as a proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 166
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1060
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 0 matches and 932 non-matches

79.0
Analysing the file: diverg(15)458_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 458), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)458_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 445
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 445 weight vectors
  Containing 203 true matches and 242 true non-matches
    (45.62% true matches)
  Identified 419 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   405  (96.66%)
          2 :    11  (2.63%)
          3 :     2  (0.48%)
         12 :     1  (0.24%)

Identified 1 non-pure unique weight vector (from 419 unique weight vectors)
Pureness (as a proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 177
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 241

Removed 1 non-pure weight vector

Final number of weight vectors to use: 444
  Number of unique weight vectors: 419

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (419, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 419 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 419 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 34 matches and 44 non-matches
    Purity of oracle classification:  0.564
    Entropy of oracle classification: 0.988
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0
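
The purity and entropy figures reported for each oracle classification follow directly from the match/non-match counts. Assuming purity is the majority-class fraction and entropy the binary Shannon entropy of the match proportion in bits (which is consistent with the numbers above), they can be reproduced as:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class.
    Entropy: binary Shannon entropy (bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# For the classification above: 34 matches, 44 non-matches
print(purity_entropy(34, 44))  # purity ~0.564, entropy ~0.988
```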

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 341 weight vectors
  Based on 34 matches and 44 non-matches
  Classified 139 matches and 202 non-matches
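
Each SVM step trains on the oracle-labelled sample and partitions the rest of the cluster by predicted class, producing the two child clusters pushed onto the queue. A minimal sketch, assuming scikit-learn with a linear kernel (the kernel and parameters the program actually uses are not visible in this log):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Fit an SVM on the oracle-classified sample, then partition the
    remaining unclassified cluster into predicted matches/non-matches."""
    clf = SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)
    predictions = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, predictions) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, predictions) if p == 0]
    return matches, non_matches
```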

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (139, 0.5641025641025641, 0.9881108365218301, 0.4358974358974359)
    (202, 0.5641025641025641, 0.9881108365218301, 0.4358974358974359)

Current size of match and non-match training data sets: 34 / 44

Selected cluster (queue ordering: random) with:
- Purity 0.56 and entropy 0.99
- Size 202 weight vectors
- Estimated match proportion 0.436

Sample size for this cluster: 64

Farthest first selection of 64 weight vectors from 202 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.929, 1.000, 0.182, 0.238, 0.188, 0.146, 0.270] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
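
The farthest-first selection above can be sketched as a greedy traversal. A minimal sketch, assuming Euclidean distance and seeding with the first vector (the program's actual seeding strategy and distance function may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: seed with the first vector, then
    repeatedly add the vector whose minimum distance to the
    already-selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```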

Perform oracle with 100.00% accuracy on 64 weight vectors
  The oracle will correctly classify 64 weight vectors and wrongly classify 0
  Classified 7 matches and 57 non-matches
    Purity of oracle classification:  0.891
    Entropy of oracle classification: 0.498
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 64 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
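
The overall control flow visible in this log, i.e. pop a cluster from the queue, oracle-label a sample, grow the training sets, and re-queue impure clusters for further splitting until the manual-classification budget is spent, can be sketched as follows (the sampling and splitting steps are stand-ins here; the program uses farthest-first selection and an SVM):

```python
import random

def recursive_selection(vectors, labels, budget, sample_size, min_purity):
    """Sketch of the loop in this log. Stops once the manual
    classification budget is exhausted or the queue is empty."""
    queue = [list(range(len(vectors)))]
    used = 0
    train_m, train_n = [], []
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random queue ordering
        sample = cluster[:sample_size]          # stand-in for farthest-first
        used += len(sample)                     # oracle classifications spent
        matches = [i for i in sample if labels[i]]
        non_matches = [i for i in sample if not labels[i]]
        train_m += matches
        train_n += non_matches
        rest = cluster[sample_size:]
        purity = max(len(matches), len(non_matches)) / max(len(sample), 1)
        if purity < min_purity and rest:        # not pure enough: split further
            half = len(rest) // 2               # stand-in for the SVM split
            queue += [rest[:half], rest[half:]]
    return train_m, train_n
```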

57.0
Analysing file: diverg(10)312_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (10, 1 - acm diverg, 312), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)312_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 323
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 323 weight vectors
  Containing 207 true matches and 116 true non-matches
    (64.09% true matches)
  Identified 291 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   277  (95.19%)
          2 :    11  (3.78%)
          3 :     2  (0.69%)
         18 :     1  (0.34%)
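
The frequency distribution above can be computed with two nested counts, first per distinct vector and then over those counts (a minimal sketch):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of distinct weight vectors that
    occur exactly that many times."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```

For example, `occurrence_distribution` of five vectors where one value repeats three times reports one vector occurring 3 times and two occurring once.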

Identified 1 non-pure unique weight vector (from 291 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 175
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 115

Removed 1 non-pure weight vector
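
Pureness of a unique weight vector is the fraction of its occurrences that are true matches; vectors that are neither fully matching nor fully non-matching have their minority-class copies removed. A minimal sketch of that filter (the names and the tie-breaking rule are assumptions):

```python
from collections import defaultdict

def remove_minority_copies(pairs):
    """pairs: iterable of (weight_vector_tuple, is_match). For every
    unique vector, keep all copies of its majority class and drop the
    minority-class copies."""
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[vec].append(is_match)
    kept = []
    for vec, flags in groups.items():
        majority = sum(flags) * 2 >= len(flags)  # ties counted as matches
        kept.extend((vec, f) for f in flags if f == majority)
    return kept
```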

Final number of weight vectors to use: 322
  Number of unique weight vectors: 291

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (291, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 291 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 72

Perform initial selection using "far" method

Farthest first selection of 72 weight vectors from 291 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 33 matches and 39 non-matches
    Purity of oracle classification:  0.542
    Entropy of oracle classification: 0.995
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  39
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 219 weight vectors
  Based on 33 matches and 39 non-matches
  Classified 146 matches and 73 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 72
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.5416666666666666, 0.9949848281859701, 0.4583333333333333)
    (73, 0.5416666666666666, 0.9949848281859701, 0.4583333333333333)

Current size of match and non-match training data sets: 33 / 39

Selected cluster (queue ordering: random) with:
- Purity 0.54 and entropy 0.99
- Size 73 weight vectors
- Estimated match proportion 0.458

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 73 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.619, 1.000, 0.103, 0.163, 0.129, 0.146, 0.213] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 4 matches and 38 non-matches
    Purity of oracle classification:  0.905
    Entropy of oracle classification: 0.454
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing file: diverg(15)644_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 644), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)644_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1053
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1053 weight vectors
  Containing 208 true matches and 845 true non-matches
    (19.75% true matches)
  Identified 1006 unique weight vectors
  Frequency distribution of occurences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :   971  (96.52%)
          2 :    32  (3.18%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1006 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 824

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1052
  Number of unique weight vectors: 1006

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1006, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1006 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1006 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 919 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 152 matches and 767 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (767, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 767 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 767 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.667, 0.500, 0.647, 0.556, 0.684] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.750, 0.429, 0.526, 0.500, 0.846] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.583, 0.444, 0.412, 0.318, 0.421] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.233, 0.545, 0.714, 0.455, 0.238] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.462, 0.889, 0.455, 0.211, 0.375] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 3 matches and 72 non-matches
    Purity of oracle classification:  0.960
    Entropy of oracle classification: 0.242
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0
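
The purity and entropy reported for each oracle-classified sample follow directly from the match/non-match counts; a minimal sketch, assuming purity is the majority-class fraction and entropy the binary Shannon entropy of the match proportion (`cluster_stats` is an illustrative name, not from the script):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary Shannon entropy
    of an oracle-classified sample, as reported in the log above."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    # Entropy is 0 for a pure sample; guard against log2(0).
    entropy = 0.0
    if 0.0 < p < 1.0:
        entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return purity, entropy
```

For the 3 matches and 72 non-matches above this yields purity 0.960 and entropy ≈ 0.242.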

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)899_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987952
recall                 0.274247
f-measure              0.429319
da                           83
dm                            0
ndm                           0
tp                           82
fp                            1
tn                  4.76529e+07
fn                          217
Name: (10, 1 - acm diverg, 899), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)899_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 864
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 864 weight vectors
  Containing 169 true matches and 695 true non-matches
    (19.56% true matches)
  Identified 825 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   795  (96.36%)
          2 :    27  (3.27%)
          3 :     2  (0.24%)
          9 :     1  (0.12%)
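
The occurrence table above is a two-level frequency count over identical weight vectors; a sketch using `collections.Counter` (`occurrence_distribution` is a hypothetical helper, not the script's actual code):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map 'occurrence count' -> 'number of unique weight vectors that
    occur that often', mirroring the frequency table in the log."""
    per_vector = Counter(tuple(v) for v in weight_vectors)  # copies per unique vector
    return Counter(per_vector.values())
```

Summing count times frequency over the table (795·1 + 27·2 + 2·3 + 1·9) recovers the 864 loaded vectors.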

Identified 1 non-pure unique weight vector (from 825 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 150
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 674

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 855
  Number of unique weight vectors: 824
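
A non-pure unique weight vector is one whose identical copies carry conflicting true-match labels; all of its copies are removed, which is why dropping the single 0.889-pure vector (occurring 9 times) takes 864 vectors down to 855. A sketch of that filter (`remove_non_pure` is a hypothetical name, assuming 0/1 labels):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, true_labels):
    """Drop every copy of any unique weight vector whose copies carry
    mixed labels, i.e. pureness strictly between 0 and 1."""
    labels_by_vector = defaultdict(list)
    for v, lab in zip(weight_vectors, true_labels):
        labels_by_vector[tuple(v)].append(lab)
    # A pure pattern has either all-0 or all-1 labels.
    pure = {v for v, labs in labels_by_vector.items()
            if sum(labs) in (0, len(labs))}
    return [(v, lab) for v, lab in zip(weight_vectors, true_labels)
            if tuple(v) in pure]
```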

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (824, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 824 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 824 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
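
Farthest-first selection, used to draw these samples, is a greedy traversal: start from one vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. A sketch assuming Euclidean distance and a fixed starting index (the script's metric, seeding, and tie-breaking may differ):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal over a list of weight vectors;
    returns the indices of the k selected vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [start]
    # Distance from every vector to its nearest selected vector so far.
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist = [min(d, dist(v, vectors[nxt]))
                    for d, v in zip(min_dist, vectors)]
    return selected
```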

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 31 matches and 55 non-matches
    Purity of oracle classification:  0.640
    Entropy of oracle classification: 0.943
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
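
The `oracle_acc` parameter (see the usage notes at the top of the script) sets how often the simulated oracle answers correctly; at 100.00% accuracy every label is right, hence the zero false matches and false non-matches above. A sketch of such a noisy oracle (names illustrative):

```python
import random

def noisy_oracle(true_match_statuses, accuracy, seed=42):
    """Return each true label with probability `accuracy`,
    otherwise the flipped label."""
    rng = random.Random(seed)
    return [status if rng.random() < accuracy else not status
            for status in true_match_statuses]
```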

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 738 weight vectors
  Based on 31 matches and 55 non-matches
  Classified 169 matches and 569 non-matches
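
Each impure cluster is then split by a classifier trained on the oracle-labelled sample. The log shows an SVM; the split step is sketched here with a dependency-free nearest-centroid stand-in, not the script's actual SVM:

```python
def split_cluster(unlabelled, match_sample, nonmatch_sample):
    """Split the remaining weight vectors of a cluster into predicted
    matches and non-matches (nearest-centroid stand-in for the SVM)."""
    def centroid(vs):
        return [sum(col) / len(vs) for col in zip(*vs)]
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    c_match = centroid(match_sample)
    c_non = centroid(nonmatch_sample)
    matches, non_matches = [], []
    for v in unlabelled:
        (matches if sq_dist(v, c_match) < sq_dist(v, c_non)
         else non_matches).append(v)
    return matches, non_matches
```

With the 31 match and 55 non-match training vectors this partitions the remaining 738 vectors into the two child clusters queued in Loop 2.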

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (169, 0.6395348837209303, 0.9430685934712908, 0.36046511627906974)
    (569, 0.6395348837209303, 0.9430685934712908, 0.36046511627906974)

Current size of match and non-match training data sets: 31 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 569 weight vectors
- Estimated match proportion 0.360

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 569 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.538, 0.789, 0.353, 0.545, 0.550] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.444, 0.643, 0.421, 0.200, 0.556] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [1.000, 0.000, 0.350, 0.455, 0.625, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.667, 0.444, 0.556, 0.222, 0.143] (False)
    [1.000, 0.000, 0.583, 0.389, 0.471, 0.545, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 0 matches and 76 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

83.0
Analysing the file: diverg(10)979_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 979), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)979_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 868
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 868 weight vectors
  Containing 190 true matches and 678 true non-matches
    (21.89% true matches)
  Identified 828 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   794  (95.89%)
          2 :    31  (3.74%)
          3 :     2  (0.24%)
          6 :     1  (0.12%)

Identified 0 non-pure unique weight vectors (from 828 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 170
     0.000 : 658

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 868
  Number of unique weight vectors: 828

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (828, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 828 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 828 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 27 matches and 59 non-matches
    Purity of oracle classification:  0.686
    Entropy of oracle classification: 0.898
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 742 weight vectors
  Based on 27 matches and 59 non-matches
  Classified 126 matches and 616 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (126, 0.686046511627907, 0.8976844934141643, 0.313953488372093)
    (616, 0.686046511627907, 0.8976844934141643, 0.313953488372093)

Current size of match and non-match training data sets: 27 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.90
- Size 126 weight vectors
- Estimated match proportion 0.314

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 126 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 49 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.141
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing the file: diverg(10)264_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985915
recall                 0.234114
f-measure              0.378378
da                           71
dm                            0
ndm                           0
tp                           70
fp                            1
tn                  4.76529e+07
fn                          229
Name: (10, 1 - acm diverg, 264), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)264_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 272
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 272 weight vectors
  Containing 171 true matches and 101 true non-matches
    (62.87% true matches)
  Identified 251 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   236  (94.02%)
          2 :    12  (4.78%)
          3 :     2  (0.80%)
          6 :     1  (0.40%)

Identified 0 non-pure unique weight vectors (from 251 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 152
     0.000 : 99

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 272
  Number of unique weight vectors: 251

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (251, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 251 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 69

Perform initial selection using "far" method

Farthest first selection of 69 weight vectors from 251 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
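
The "far" (farthest-first) selection above greedily picks, at each step, the weight vector whose minimum distance to the already-selected set is largest, so the sample spreads across the similarity space. A minimal sketch, where Euclidean distance and seeding with the first vector are assumptions:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors."""
    selected = [vectors[0]]                  # assumed seed: the first vector
    remaining = list(vectors[1:])
    while remaining and len(selected) < k:
        # pick the vector farthest from its nearest already-selected vector
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```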

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 33 matches and 36 non-matches
    Purity of oracle classification:  0.522
    Entropy of oracle classification: 0.999
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  36
    Number of false non-matches: 0
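
The reported purity 0.522 and entropy 0.999 follow directly from the 33/36 split: purity is the majority-class fraction and entropy is the binary Shannon entropy (in bits) of the match proportion. A sketch:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0)
    return purity, entropy

# 33 matches / 36 non-matches -> purity ~0.522, entropy ~0.999
```

These parent values are what the two SVM-split child clusters inherit in Loop 2 below (both show 0.5217... and 0.9986...).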

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 182 weight vectors
  Based on 33 matches and 36 non-matches
  Classified 126 matches and 56 non-matches
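
The split step trains a classifier on the 69 oracle-labelled vectors and uses it to partition the 182 remaining vectors into candidate match and non-match clusters. A hedged sketch using scikit-learn's SVC as a stand-in (the script's actual SVM implementation and parameters are unknown):

```python
from sklearn.svm import SVC

# hypothetical labelled sample (2-D similarity vectors for brevity;
# the log above uses 7 weights per vector)
X_train = [[0.95, 0.90], [0.85, 0.80], [0.10, 0.20], [0.15, 0.05]]
y_train = [1, 1, 0, 0]          # 1 = match, 0 = non-match (from the oracle)

clf = SVC(kernel="linear")
clf.fit(X_train, y_train)

# classify the unlabelled remainder; each predicted class becomes a
# new cluster pushed onto the queue
X_rest = [[0.90, 0.85], [0.20, 0.10]]
pred = clf.predict(X_rest)
```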

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 69
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (126, 0.5217391304347826, 0.9986359641585718, 0.4782608695652174)
    (56, 0.5217391304347826, 0.9986359641585718, 0.4782608695652174)

Current size of match and non-match training data sets: 33 / 36

Selected cluster (queue ordering: random) with:
- Purity 0.52 and entropy 1.00
- Size 56 weight vectors
- Estimated match proportion 0.478

Sample size for this cluster: 36

Farthest first selection of 36 weight vectors from 56 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.367, 0.667, 0.583, 0.625, 0.316] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)

Perform oracle with 100.00% accuracy on 36 weight vectors
  The oracle will correctly classify 36 weight vectors and wrongly classify 0
  Classified 0 matches and 36 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  36
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 36 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

71.0
Analysing file: diverg(15)33_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 33), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)33_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 665
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 665 weight vectors
  Containing 217 true matches and 448 true non-matches
    (32.63% true matches)
  Identified 628 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   610  (97.13%)
          2 :    15  (2.39%)
          3 :     2  (0.32%)
         19 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 628 unique weight vectors)
Pureness (as a percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 445

Removed 1 non-pure weight vector
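
A unique weight vector generated by both true matching and true non-matching pairs (the 0.947-pure vector above, i.e. 18 matches and 1 non-match among its 19 copies) cannot be labelled consistently, so the minority-class copies are dropped, taking 665 vectors to 664. A sketch of this clean-up under an assumed simple majority rule:

```python
from collections import defaultdict

def drop_minority_copies(pairs):
    """pairs: iterable of (weight_vector_tuple, is_match). For any vector
    occurring with both labels, keep only the majority-label copies."""
    labels_by_vec = defaultdict(list)
    for vec, is_match in pairs:
        labels_by_vec[vec].append(is_match)
    kept = []
    for vec, labels in labels_by_vec.items():
        majority = sum(labels) * 2 >= len(labels)  # tie kept as match (assumption)
        kept.extend((vec, majority) for _ in range(labels.count(majority)))
    return kept
```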

Final number of weight vectors to use: 664
  Number of unique weight vectors: 628

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (628, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 628 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 628 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 545 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 135 matches and 410 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (410, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 135 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 135 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)905_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (10, 1 - acm diverg, 905), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)905_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 951
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 951 weight vectors
  Containing 210 true matches and 741 true non-matches
    (22.08% true matches)
  Identified 898 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   863  (96.10%)
          2 :    32  (3.56%)
          3 :     2  (0.22%)
         18 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 898 unique weight vectors)
Pureness (as a percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 177
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 720

Removed 1 non-pure weight vector

Final number of weight vectors to use: 950
  Number of unique weight vectors: 898

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (898, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 898 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 898 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 23 matches and 63 non-matches
    Purity of oracle classification:  0.733
    Entropy of oracle classification: 0.838
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 812 weight vectors
  Based on 23 matches and 63 non-matches
  Classified 0 matches and 812 non-matches

48.0
Analysing file: diverg(20)361_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 361), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)361_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as a percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
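
The "far" (farthest-first) selection logged above greedily picks vectors that are maximally distant from those already chosen. A minimal sketch, assuming Euclidean distance and the first vector as seed; the metric and seeding used by the actual script may differ, and `farthest_first` is an illustrative name:

```python
import math

def euclidean(u, v):
    # Plain Euclidean distance between two weight vectors (an assumption;
    # the script may use a different metric).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising its distance to the
    nearest already-selected vector (seeded with the first vector)."""
    selected = [vectors[0]]
    # Distance from every vector to its nearest selected vector so far.
    dist = [euclidean(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            dist[i] = min(dist[i], euclidean(v, vectors[idx]))
    return selected
```

Each round is O(n) after the argmax, so selecting k of n vectors costs O(nk) distance computations.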

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
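
The purity and entropy reported for each oracle round follow directly from the match / non-match counts; the estimated match proportion is simply the match fraction. A minimal sketch (`purity` and `entropy` are illustrative helpers that reproduce the logged values for 28 matches and 58 non-matches):

```python
import math

def purity(num_match, num_non_match):
    # Fraction of the sample belonging to its majority class.
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    # Shannon entropy (base 2) of the match / non-match distribution.
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

print(f"{purity(28, 58):.3f}")   # 0.674
print(f"{entropy(28, 58):.3f}")  # 0.910
print(f"{28 / 86:.3f}")          # 0.326 (estimated match proportion)
```

These are exactly the values carried into the Loop 2 queue entries below.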

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches
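
The split step trains a classifier on the oracle-labelled vectors and partitions the remaining unlabelled vectors into a predicted-match and a predicted-non-match cluster. A minimal sketch assuming scikit-learn's `svm.SVC` with a linear kernel; the kernel and parameters of the actual script's SVM are not shown in the log:

```python
from sklearn import svm

def split_cluster(train_vecs, train_labels, rest_vecs):
    """Train an SVM on the oracle-classified vectors (label 1 = match,
    0 = non-match) and split the unlabelled remainder into two clusters."""
    clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(rest_vecs)
    match_cluster = [v for v, p in zip(rest_vecs, preds) if p == 1]
    non_match_cluster = [v for v, p in zip(rest_vecs, preds) if p == 0]
    return match_cluster, non_match_cluster
```

Both resulting clusters are then pushed back onto the queue with the purity, entropy, and match-proportion estimates from the oracle sample.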

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 153 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)915_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 915), dtype: object
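
The precision, recall, and f-measure in these result rows follow directly from the tp/fp/fn counts. For the row above (tp=43, fp=0, fn=256), a quick check (`prf` is an illustrative helper, not part of the script):

```python
def prf(tp, fp, fn):
    # Standard precision / recall / F1 from confusion-matrix counts.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

p, r, f = prf(43, 0, 256)
print(f"{p:.6g} {r:.6f} {f:.6f}")  # 1 0.143813 0.251462
```

With fp=0, precision is 1 and the f-measure reduces to 2*tp / (2*tp + fn) = 86/342.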

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)915_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 245
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 245 weight vectors
  Containing 205 true matches and 40 true non-matches
    (83.67% true matches)
  Identified 215 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   201  (93.49%)
          2 :    11  (5.12%)
          3 :     2  (0.93%)
         16 :     1  (0.47%)

Identified 1 non-pure unique weight vector (from 215 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 175
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 39

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 244
  Number of unique weight vectors: 215

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (215, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 215 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 66

Perform initial selection using "far" method

Farthest first selection of 66 weight vectors from 215 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 40 matches and 26 non-matches
    Purity of oracle classification:  0.606
    Entropy of oracle classification: 0.967
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  26
    Number of false non-matches: 0

Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 149 weight vectors
  Based on 40 matches and 26 non-matches
  Classified 149 matches and 0 non-matches

43.0
Analyzing file: diverg(15)57_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 57), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)57_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1058
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1058 weight vectors
  Containing 226 true matches and 832 true non-matches
    (21.36% true matches)
  Identified 1001 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   964  (96.30%)
          2 :    34  (3.40%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1001 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 811

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1057
  Number of unique weight vectors: 1001

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1001, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1001 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1001 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 914 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 160 matches and 754 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (160, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (754, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 754 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 754 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 4 matches and 71 non-matches
    Purity of oracle classification:  0.947
    Entropy of oracle classification: 0.300
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)471_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 471), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)471_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
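
The occurrence histogram above (how many unique weight vectors appear once, twice, and so on) can be computed with two nested `Counter`s; a minimal sketch, assuming weight vectors compare as plain tuples of floats:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First count how often each unique vector occurs, then count how
    # many unique vectors share each occurrence count.
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())

vecs = [(1.0, 0.5)] * 3 + [(0.0, 0.2)] * 2 + [(0.3, 0.3)]
print(occurrence_distribution(vecs))  # one vector occurring 3x, one 2x, one 1x
```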

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
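
The clean-up step above treats a unique weight vector whose copies carry mixed true-match labels as non-pure and removes its minority-label copies (here 1 of the 20 copies at pureness 0.950). A hypothetical reconstruction of that behaviour:

```python
from collections import defaultdict

def drop_minority_copies(labelled_vectors):
    # labelled_vectors: list of (weight_vector, is_match) pairs.
    labels_by_vec = defaultdict(list)
    for vec, is_match in labelled_vectors:
        labels_by_vec[tuple(vec)].append(is_match)
    kept = []
    for vec, is_match in labelled_vectors:
        labels = labels_by_vec[tuple(vec)]
        n_match = sum(labels)
        non_pure = 0 < n_match < len(labels)
        majority_is_match = n_match * 2 >= len(labels)
        if non_pure and is_match != majority_is_match:
            continue  # minority-label copy of a non-pure vector: removed
        kept.append((vec, is_match))
    return kept

# 4 copies of one vector (3 matches, 1 non-match: pureness 0.75) plus one
# pure non-match vector; only the single minority copy is dropped.
pairs = [((0.9, 0.9), True)] * 3 + [((0.9, 0.9), False), ((0.1, 0.1), False)]
print(len(drop_minority_copies(pairs)))  # 4
```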

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
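
The "far" method shown above is a farthest-first traversal: after a seed, each step greedily selects the vector whose distance to its nearest already-selected vector is largest. A minimal sketch, where the seeding rule (here simply the first vector) and the Euclidean metric are assumptions:

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over a list of numeric vectors.
    selected = [vectors[0]]                       # assumed seed: first vector
    nearest = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=nearest.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):           # refresh nearest-selected distances
            nearest[j] = min(nearest[j], math.dist(v, vectors[i]))
    return selected

print(farthest_first([(0, 0), (1, 0), (0.1, 0), (5, 0)], 2))  # [(0, 0), (5, 0)]
```

Each added vector costs one distance pass over the cluster, so selecting k of n vectors is O(k·n) distance computations.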

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
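
The SVM step above trains on the oracle-labelled sample and splits the remaining cluster by predicted label. A sketch using scikit-learn's `SVC`; the original program's SVM implementation and kernel choice are assumptions:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    # Fit on the oracle-classified sample (labels: 1 = match, 0 = non-match),
    # then partition the unlabelled cluster by predicted class.
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```

With the 88 oracle-labelled vectors above as training data, the remaining 956 vectors would be split this way into the 109 predicted matches and 847 predicted non-matches that seed the two new queue clusters.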

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)174_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 174), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)174_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 998
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 998 weight vectors
  Containing 162 true matches and 836 true non-matches
    (16.23% true matches)
  Identified 959 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   930  (96.98%)
          2 :    26  (2.71%)
          3 :     2  (0.21%)
         10 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 959 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 143
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 815

Removed 1 non-pure weight vector

Final number of weight vectors to use: 997
  Number of unique weight vectors: 959

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (959, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 959 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 959 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 872 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 83 matches and 789 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (83, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (789, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 789 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 789 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 13 matches and 57 non-matches
    Purity of oracle classification:  0.814
    Entropy of oracle classification: 0.692
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing the file: diverg(15)966_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 966), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)966_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 728
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 728 weight vectors
  Containing 197 true matches and 531 true non-matches
    (27.06% true matches)
  Identified 704 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   687  (97.59%)
          2 :    14  (1.99%)
          3 :     2  (0.28%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 704 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.000 : 529

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 728
  Number of unique weight vectors: 704

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (704, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 704 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 704 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
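The purity and entropy figures reported above, and the estimated match proportion carried into the cluster queue, all follow from the oracle's match / non-match counts: purity is the majority-class fraction and entropy is the binary entropy of the match proportion. A minimal sketch (the function name is illustrative):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity and binary entropy of a cluster of classified weight vectors."""
    total = num_matches + num_non_matches
    p = num_matches / total              # estimated match proportion
    purity = max(p, 1.0 - p)             # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                      # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy, p

purity, entropy, match_prop = cluster_stats(30, 54)
print(round(purity, 3), round(entropy, 3), round(match_prop, 3))
# 0.643 0.94 0.357
```

With the 30 matches and 54 non-matches above, this reproduces the logged purity 0.643, entropy 0.940, and match proportion 0.357.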

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 620 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 131 matches and 489 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (489, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 489 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 489 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
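Farthest-first traversal, used for the selections above, greedily picks each next vector to maximise its minimum distance to the vectors already selected, spreading the sample across the cluster. A sketch; the Euclidean distance and the choice of the first vector as the seed are assumptions:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors (Euclidean distance)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]              # assumed seed: the first vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Pick the remaining vector whose nearest selected vector is farthest
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.5, 0.5), (1.0, 0.0)], 3)
print(sample)  # [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```

Note that the central vector (0.5, 0.5) is skipped: the corners are farther from everything already chosen, which is exactly why the sampled lists above mix very dissimilar weight vectors.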

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 9 matches and 66 non-matches
    Purity of oracle classification:  0.880
    Entropy of oracle classification: 0.529
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(20)498_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 498), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)498_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1075
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1075 weight vectors
  Containing 227 true matches and 848 true non-matches
    (21.12% true matches)
  Identified 1018 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   981  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
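The occurrence distribution above can be reproduced by counting identical weight vectors, for example with `collections.Counter` (function name illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """How often each distinct weight vector occurs, as {count: num_vectors}."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.9), (0.9, 0.9), (0.9, 0.9)]
print(sorted(occurrence_distribution(vectors).items()))
# [(1, 1), (2, 1), (3, 1)]
```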

Identified 1 non-pure unique weight vector (from 1018 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector
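A weight vector is non-pure when identical copies of it carry both match and non-match labels; the minority-class copies are removed so every unique vector has a single consistent label. A sketch of that clean-up step (names and the tie-handling rule are assumptions):

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """Drop minority-class copies of weight vectors seen with both labels."""
    by_vec = defaultdict(list)
    for vec, is_match in labelled_vectors:
        by_vec[tuple(vec)].append(is_match)
    kept = []
    for vec, is_match in labelled_vectors:
        labels = by_vec[tuple(vec)]
        majority = sum(labels) * 2 >= len(labels)   # assumed: ties kept as match
        if is_match == majority:
            kept.append((vec, is_match))
    return kept

# 20 copies of one vector, 19 labelled match (pureness 0.95), plus one pure vector
data = [((0.9, 0.9), True)] * 19 + [((0.9, 0.9), False)] + [((0.1, 0.1), False)]
print(len(remove_minority_copies(data)))  # 20
```

The toy data mirrors the log: a vector with pureness 0.95 loses its single minority-class copy, shrinking the set by one.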

Final number of weight vectors to use: 1074
  Number of unique weight vectors: 1018

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1018, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1018 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1018 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 931 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 819 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (819, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)590_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (15, 1 - acm diverg, 590), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)590_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 693
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 693 weight vectors
  Containing 212 true matches and 481 true non-matches
    (30.59% true matches)
  Identified 640 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   605  (94.53%)
          2 :    32  (5.00%)
          3 :     2  (0.31%)
         18 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 640 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 179
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 460

Removed 1 non-pure weight vector

Final number of weight vectors to use: 692
  Number of unique weight vectors: 640

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (640, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 640 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 640 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 557 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 151 matches and 406 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (406, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 151 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0
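The purity and entropy figures reported by the oracle step follow the standard two-class definitions (purity = majority-class fraction, entropy = two-class Shannon entropy). A minimal sketch, not the program's own code, that reproduces the values above for 50 matches and 5 non-matches:

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity = fraction of the majority class; entropy = two-class
    Shannon entropy of the match/non-match split."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

purity, entropy = cluster_stats(50, 5)
print(round(purity, 3), round(entropy, 3))  # → 0.909 0.439
```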

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing the file: diverg(20)251_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 251), dtype: object
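The `f-measure` field in rows like the one above is the harmonic mean of precision and recall, with recall derived from the `tp` and `fn` counts. A quick check, assuming the standard F1 definition:

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# recall = tp / (tp + fn) = 39 / (39 + 260), matching the row above
recall = 39 / (39 + 260)
print(round(recall, 6))                   # → 0.130435
print(round(f_measure(1.0, recall), 6))   # → 0.230769
```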

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)251_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
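The frequency distribution above (how many distinct weight vectors occur once, twice, and so on) can be produced with a Counter over a Counter; a sketch under the assumption that weight vectors are stored as numeric tuples:

```python
from collections import Counter

def freq_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then count
    how many distinct vectors share each occurrence count."""
    vec_counts = Counter(map(tuple, weight_vectors))
    return Counter(vec_counts.values())

vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.9)]
print(freq_distribution(vectors))  # → Counter({1: 2, 2: 1})
```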

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
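The "farthest first" lists above come from a farthest-first traversal: start from some seed vector, then repeatedly select the vector whose minimum distance to the already-selected set is largest. A minimal O(n·k) sketch, assuming Euclidean distance and an arbitrary seed (the program's actual metric and seeding may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric tuples."""
    selected = [vectors[0]]  # arbitrary seed vector
    # min_dist[i] = distance of vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        # pick the vector farthest from everything selected so far
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected

pts = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (1.0, 0.0)]
print(farthest_first(pts, 3))  # → [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```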

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
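The SVM step trains on the oracle-labelled vectors and classifies the remaining unlabelled ones, which is what splits the cluster for the next loop. A hedged scikit-learn sketch; the log does not show the program's SVM library or parameters, so the linear kernel and defaults here are an assumption:

```python
# Train an SVM on oracle-labelled weight vectors, then classify the rest.
# scikit-learn is used here for illustration only.
from sklearn.svm import SVC

def svm_classify(train_vecs, train_labels, unlabelled_vecs):
    clf = SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)
    return clf.predict(unlabelled_vecs)

train = [[0.9, 0.9], [1.0, 0.8], [0.1, 0.2], [0.2, 0.1]]
labels = [1, 1, 0, 0]  # 1 = match, 0 = non-match
print(svm_classify(train, labels, [[0.95, 0.85], [0.15, 0.1]]))
```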

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)336_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 336), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)336_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 718
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 718 weight vectors
  Containing 203 true matches and 515 true non-matches
    (28.27% true matches)
  Identified 692 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   678  (97.98%)
          2 :    11  (1.59%)
          3 :     2  (0.29%)
         12 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 692 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 177
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 514

Removed 1 non-pure weight vector

Final number of weight vectors to use: 717
  Number of unique weight vectors: 692

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (692, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 692 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 692 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 608 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 114 matches and 494 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (114, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (494, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 494 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 494 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.462, 0.609, 0.643, 0.706, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 17 matches and 55 non-matches
    Purity of oracle classification:  0.764
    Entropy of oracle classification: 0.789
    Number of true matches:      17
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)869_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 869), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)869_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 716
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 716 weight vectors
  Containing 195 true matches and 521 true non-matches
    (27.23% true matches)
  Identified 692 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   675  (97.54%)
          2 :    14  (2.02%)
          3 :     2  (0.29%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 692 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 519

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 716
  Number of unique weight vectors: 692

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (692, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 692 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 692 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
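
The "far" initial selection above is a greedy farthest-first traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already selected set is largest. A minimal sketch, assuming Euclidean distance over the weight vectors (function and variable names are illustrative):

```python
import math
import random

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors.

    Starts from a random vector, then repeatedly picks the vector whose
    minimum Euclidean distance to the already selected set is largest.
    """
    selected = [random.choice(vectors)]
    while len(selected) < k:
        best, best_dist = None, -1.0
        for v in vectors:
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected
```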

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
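
The purity and entropy figures reported for the oracle-classified sample follow the standard two-class definitions: purity is the fraction of the majority class, and entropy is measured in bits. A sketch reproducing the numbers from this run (29 matches, 55 non-matches):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy in bits over the two classes."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(29, 55)
print(round(purity, 3), round(entropy, 3))  # 0.655 0.93
```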

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 608 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 134 matches and 474 non-matches
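
The split step after each oracle round trains a classifier on the newly labelled samples and partitions the remaining cluster into a predicted-match and a predicted-non-match child. The log uses an SVM; as a dependency-free stand-in, a nearest-centroid linear split illustrates the same train-then-partition idea (names are illustrative):

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def split_cluster(unlabelled, match_samples, nonmatch_samples):
    """Partition the remaining cluster into predicted matches / non-matches.

    Stand-in for the SVM in the log: each unlabelled vector goes to the
    class whose training centroid is closer (Euclidean distance).
    """
    m_c = centroid(match_samples)
    n_c = centroid(nonmatch_samples)
    matches, non_matches = [], []
    for v in unlabelled:
        (matches if math.dist(v, m_c) <= math.dist(v, n_c)
         else non_matches).append(v)
    return matches, non_matches
```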

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (134, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (474, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 474 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 474 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.870, 0.619, 0.643, 0.700, 0.524] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
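
Each run above follows the same recursive loop: pop a cluster from the queue (ordering: random), sample it, have the oracle label the sample, and, if the cluster is still impure and large enough, split the remainder and push the children back until the manual classification budget is exhausted. A skeleton of that loop, with illustrative defaults and caller-supplied `oracle` and `split` functions (this is a sketch of the control flow, not the original implementation):

```python
import random

def recursive_selection(cluster, budget, sample_size, oracle, split,
                        min_purity=0.95, min_cluster_size=5):
    """Skeleton of the recursive training-example selection loop.

    oracle(vectors) -> (matches, non_matches)
    split(rest, matches, non_matches) -> list of child clusters
    """
    queue = [cluster]
    train_m, train_nm = [], []
    used = 0
    while queue and used < budget:
        current = queue.pop(random.randrange(len(queue)))  # random queue ordering
        sample = random.sample(current, min(sample_size, len(current)))
        m, nm = oracle(sample)          # manual classification of the sample
        used += len(sample)
        train_m += m
        train_nm += nm
        rest = [v for v in current if v not in sample]
        purity = max(len(m), len(nm)) / max(len(sample), 1)
        if rest and purity < min_purity and len(rest) > min_cluster_size:
            queue.extend(split(rest, m, nm))  # e.g. SVM-predicted children
    return train_m, train_nm
```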

68.0
Analysing file: diverg(15)661_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 661), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)661_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 729
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 729 weight vectors
  Containing 221 true matches and 508 true non-matches
    (30.32% true matches)
  Identified 693 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   677  (97.69%)
          2 :    13  (1.88%)
          3 :     2  (0.29%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 693 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 507

Removed 1 non-pure weight vector

Final number of weight vectors to use: 728
  Number of unique weight vectors: 693

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (693, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 693 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 693 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 609 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 123 matches and 486 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (486, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 486 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 486 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.462, 0.609, 0.643, 0.706, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 17 matches and 54 non-matches
    Purity of oracle classification:  0.761
    Entropy of oracle classification: 0.794
    Number of true matches:      17
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)53_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 53), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)53_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 521
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 521 weight vectors
  Containing 206 true matches and 315 true non-matches
    (39.54% true matches)
  Identified 492 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   475  (96.54%)
          2 :    14  (2.85%)
          3 :     2  (0.41%)
         12 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 492 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 312

Removed 1 non-pure weight vector

Final number of weight vectors to use: 520
  Number of unique weight vectors: 492

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (492, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 492 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 492 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 33 matches and 47 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.978
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0
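
The purity and entropy figures the oracle reports above follow the standard two-class definitions: purity is the fraction of the majority class in the sample, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch that reproduces the reported numbers (the function name is illustrative, not from the program):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Two-class purity and binary Shannon entropy of a classified sample."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)         # fraction of the majority class
    if p in (0.0, 1.0):
        entropy = 0.0                # a pure sample has zero entropy
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy

# Reproduces the figures reported for 33 matches / 47 non-matches:
purity, entropy = purity_entropy(33, 47)
print(round(purity, 3), round(entropy, 3))  # 0.588 0.978
```

These are also the per-cluster statistics carried in the queue (e.g. 0.5875 and 0.9777945702913884 in the next loop's listing).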

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 412 weight vectors
  Based on 33 matches and 47 non-matches
  Classified 142 matches and 270 non-matches
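
The split step above trains a classifier (an SVM in this program) on the oracle-labelled vectors and lets it partition the remaining vectors of the cluster into a predicted-match and a predicted-non-match cluster, both of which go back on the queue. A dependency-free stand-in sketch using a nearest-centroid rule instead of an SVM (function and variable names are illustrative):

```python
def split_cluster(labelled, unlabelled):
    """Split unlabelled weight vectors into predicted matches/non-matches.

    labelled: list of (vector, is_match) pairs from the oracle.
    Uses a nearest-centroid rule as a stand-in for the SVM the program
    trains on the oracle-classified vectors.
    """
    def centroid(vecs):
        n = len(vecs)
        return tuple(sum(col) / n for col in zip(*vecs))

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    match_c = centroid([v for v, m in labelled if m])
    non_match_c = centroid([v for v, m in labelled if not m])
    matches, non_matches = [], []
    for v in unlabelled:
        (matches if sq_dist(v, match_c) < sq_dist(v, non_match_c)
         else non_matches).append(v)
    return matches, non_matches  # two new clusters for the queue

labelled = [((0.9, 0.8), True), ((0.8, 0.9), True),
            ((0.1, 0.2), False), ((0.2, 0.1), False)]
matches, non_matches = split_cluster(labelled, [(0.85, 0.9), (0.15, 0.1)])
```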

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.5875, 0.9777945702913884, 0.4125)
    (270, 0.5875, 0.9777945702913884, 0.4125)

Current size of match and non-match training data sets: 33 / 47

Selected cluster (queue ordering: random) with:
- Purity 0.59 and entropy 0.98
- Size 142 weight vectors
- Estimated match proportion 0.412

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
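
Farthest-first selection, as used above, greedily grows the sample: starting from an initial vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the whole cluster. A small sketch with Euclidean distance (names are illustrative, not the program's):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors that are maximally spread out."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        # pick the vector farthest from everything selected so far
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected

sample = farthest_first([(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)], 3)
# → [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

Note how the near-duplicate (0.1, 0.0) is skipped: points close to an already-selected vector never maximise the minimum distance.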

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 50 matches and 6 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.491
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)895_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976923
recall                 0.424749
f-measure              0.592075
da                          130
dm                            0
ndm                           0
tp                          127
fp                            3
tn                  4.76529e+07
fn                          172
Name: (10, 1 - acm diverg, 895), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)895_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 789
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 789 weight vectors
  Containing 130 true matches and 659 true non-matches
    (16.48% true matches)
  Identified 758 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   730  (96.31%)
          2 :    25  (3.30%)
          3 :     3  (0.40%)

Identified 0 non-pure unique weight vectors (from 758 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 119
     0.000 : 639

Removed 0 non-pure weight vectors
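
The pureness analysis above groups identical weight vectors and checks whether all of their true match labels agree; a unique vector with mixed labels (pureness strictly between 0 and 1) is removed before selection. A sketch of that computation (names are illustrative):

```python
from collections import defaultdict

def pureness_of_unique_vectors(weight_vectors):
    """Pureness (fraction of true matches) per unique weight vector.

    weight_vectors: list of (vector_tuple, is_true_match) pairs.
    A unique vector occurring with mixed labels gets a pureness
    strictly between 0 and 1 and is a candidate for removal.
    """
    counts = defaultdict(lambda: [0, 0])  # vector -> [num_matches, total]
    for vec, is_match in weight_vectors:
        counts[vec][0] += int(is_match)
        counts[vec][1] += 1
    return {vec: m / n for vec, (m, n) in counts.items()}

pure = pureness_of_unique_vectors([
    ((1.0, 0.9), True), ((1.0, 0.9), True),   # pure match vector
    ((0.2, 0.1), False),                      # pure non-match vector
    ((0.5, 0.5), True), ((0.5, 0.5), False),  # non-pure: pureness 0.5
])
```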

Final number of weight vectors to use: 789
  Number of unique weight vectors: 758

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (758, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 758 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 758 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 673 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 118 matches and 555 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (555, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 118 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 118 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 35 matches and 15 non-matches
    Purity of oracle classification:  0.700
    Entropy of oracle classification: 0.881
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  15
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

130.0
Analysing file: diverg(10)952_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990099
recall                 0.334448
f-measure                   0.5
da                          101
dm                            0
ndm                           0
tp                          100
fp                            1
tn                  4.76529e+07
fn                          199
Name: (10, 1 - acm diverg, 952), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)952_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 779
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 779 weight vectors
  Containing 165 true matches and 614 true non-matches
    (21.18% true matches)
  Identified 740 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   711  (96.08%)
          2 :    26  (3.51%)
          3 :     2  (0.27%)
         10 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 740 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 146
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 593

Removed 1 non-pure weight vector

Final number of weight vectors to use: 778
  Number of unique weight vectors: 740

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (740, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 740 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 740 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 655 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 116 matches and 539 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (116, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (539, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 539 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 539 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.800, 0.571, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.791, 1.000, 0.275, 0.269, 0.192, 0.084, 0.200] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 2 matches and 72 non-matches
    Purity of oracle classification:  0.973
    Entropy of oracle classification: 0.179
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0
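
The purity and entropy values reported for each oracle-classified sample follow the usual binary definitions: purity is the majority-class fraction, entropy is the binary Shannon entropy of the match proportion. A minimal sketch (the function name `purity_entropy` is mine, not from the program):

```python
from math import log2

def purity_entropy(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary entropy of a
    classified sample of weight vectors."""
    p = num_matches / (num_matches + num_non_matches)
    purity = max(p, 1.0 - p)
    entropy = -sum(q * log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = purity_entropy(2, 72)
print(round(purity, 3), round(entropy, 3))  # 0.973 0.179, as reported above
```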

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

101.0
Analysing file: diverg(20)407_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 407), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)407_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
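
The occurrence table above (how many distinct weight vectors occur once, twice, and so on) can be computed with two nested `Counter`s; a sketch with a hypothetical helper name:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map 'number of occurrences' -> 'number of distinct weight
    vectors occurring that often', i.e. the frequency table above."""
    per_vector = Counter(map(tuple, weight_vectors))
    return Counter(per_vector.values())

# Three distinct vectors, one of them duplicated:
dist = occurrence_distribution([(1.0,), (1.0,), (0.5,), (0.2,)])
print(sorted(dist.items()))  # [(1, 2), (2, 1)]
```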

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036
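
The non-pure handling above (a unique weight vector whose record pairs carry both true-match labels keeps only its majority-class copies) can be sketched as follows; the function and variable names are mine, not the program's:

```python
from collections import Counter, defaultdict

def remove_minority_class(weight_vectors, labels):
    """Drop minority-class copies of any weight vector that occurs with
    mixed true-match labels, keeping only the majority class."""
    by_vec = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        by_vec[tuple(vec)].append(lab)
    majority = {v: Counter(labs).most_common(1)[0][0]
                for v, labs in by_vec.items()}
    return [(vec, lab) for vec, lab in zip(weight_vectors, labels)
            if lab == majority[tuple(vec)]]

# 20 copies of one vector, 19 matches and 1 non-match (pureness 0.95):
vecs = [(0.9, 0.9)] * 20
labs = [True] * 19 + [False]
print(len(remove_minority_class(vecs, labs)))  # 19
```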

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
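
Farthest-first selection greedily adds the weight vector whose minimum Euclidean distance to the already-selected set is largest. A minimal numpy sketch (the deterministic start index is a simplification; the program may pick the first vector differently):

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly add the vector with
    the largest minimum Euclidean distance to the selected set."""
    vecs = np.asarray(vectors, dtype=float)
    chosen = [start]
    min_dist = np.linalg.norm(vecs - vecs[start], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(min_dist))       # farthest from all chosen
        chosen.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(vecs - vecs[nxt], axis=1))
    return chosen

print(farthest_first([[0, 0], [1, 0], [0, 1], [10, 10]], 3))  # [0, 3, 1]
```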

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 24 matches and 64 non-matches
    Purity of oracle classification:  0.727
    Entropy of oracle classification: 0.845
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 24 matches and 64 non-matches
  Classified 91 matches and 857 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (91, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)
    (857, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)

Current size of match and non-match training data sets: 24 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.73 and entropy 0.85
- Size 857 weight vectors
- Estimated match proportion 0.273

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 857 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 18 matches and 52 non-matches
    Purity of oracle classification:  0.743
    Entropy of oracle classification: 0.822
    Number of true matches:      18
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)323_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 323), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)323_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 908
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 908 weight vectors
  Containing 212 true matches and 696 true non-matches
    (23.35% true matches)
  Identified 856 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   820  (95.79%)
          2 :    33  (3.86%)
          3 :     2  (0.23%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 856 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 675

Removed 1 non-pure weight vector

Final number of weight vectors to use: 907
  Number of unique weight vectors: 856

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (856, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 856 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 856 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 770 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 165 matches and 605 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (165, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (605, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 165 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 165 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 46 matches and 11 non-matches
    Purity of oracle classification:  0.807
    Entropy of oracle classification: 0.708
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(15)956_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 956), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)956_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1060
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1060 weight vectors
  Containing 214 true matches and 846 true non-matches
    (20.19% true matches)
  Identified 1006 unique weight vectors
  Frequency distribution of occurences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :   971  (96.52%)
          2 :    32  (3.18%)
          3 :     2  (0.20%)
         19 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1006 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 825

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1059
  Number of unique weight vectors: 1006

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1006, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1006 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1006 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
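The farthest-first traversal used for the selections above can be sketched as follows. The distance metric (Euclidean) and the choice of starting vector are assumptions, since the log does not state them.

```python
import math

def farthest_first(vectors, k):
    """Greedily pick k vectors: each new pick maximises its minimum
    Euclidean distance to the vectors selected so far."""
    selected = [vectors[0]]  # starting vector is an assumption
    while len(selected) < k:
        remaining = [v for v in vectors if v not in selected]
        # pick the vector farthest from its nearest already-selected vector
        selected.append(max(remaining,
                            key=lambda v: min(math.dist(v, s)
                                              for s in selected)))
    return selected
```

This spreads the sample over the weight-vector space, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.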

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0
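The purity and entropy figures reported by the oracle step can be reproduced as follows, assuming purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion (both consistent with the numbers printed above).

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class fraction and binary entropy of a cluster sample."""
    total = num_matches + num_non_matches
    p = num_matches / total            # estimated match proportion
    purity = max(p, 1.0 - p)           # fraction of the majority class
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For example, 23 matches and 64 non-matches give purity 0.736 and entropy 0.833, matching the log above; a sample that is all matches gives purity 1.000 and entropy 0.000.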

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 919 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 101 matches and 818 non-matches
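The SVM split step can be sketched with scikit-learn; the kernel and parameters are assumptions, since the log only says an SVM is trained on the oracle-labelled vectors and used to split the remaining cluster into predicted matches and non-matches.

```python
from sklearn.svm import SVC

def svm_split(match_vecs, non_match_vecs, cluster_vecs):
    """Train an SVM on the oracle-labelled vectors and split the
    remaining cluster according to its predictions."""
    X = match_vecs + non_match_vecs
    y = [1] * len(match_vecs) + [0] * len(non_match_vecs)
    clf = SVC(kernel="linear")          # kernel choice is an assumption
    clf.fit(X, y)
    pred = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the next loop reports "Queue length: 2".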

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (818, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 101 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 43 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)234_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 234), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)234_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 645
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 645 weight vectors
  Containing 213 true matches and 432 true non-matches
    (33.02% true matches)
  Identified 609 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   592  (97.21%)
          2 :    14  (2.30%)
          3 :     2  (0.33%)
         19 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 609 unique weight vectors)
Pureness (fraction of occurrences that are matches) for each unique weight vector:
  Pureness : Count
     1.000 : 179
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 429

Removed 1 non-pure weight vector
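The uniqueness and pureness analysis above can be sketched as follows: identical weight vectors are grouped, each group's match proportion is computed, and vectors in non-pure groups (those that occur both as matches and non-matches) are flagged for minority-class removal. Function and variable names here are illustrative, not the program's own.

```python
from collections import Counter, defaultdict

def pureness_analysis(weight_vectors, match_flags):
    """Group identical weight vectors; per unique vector, compute the
    fraction of its occurrences that are true matches."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, match_flags):
        groups[tuple(vec)].append(is_match)
    pureness = {vec: sum(flags) / len(flags)
                for vec, flags in groups.items()}
    # non-pure vectors occur with both true match statuses
    non_pure = [vec for vec, p in pureness.items() if 0.0 < p < 1.0]
    # occurrence count -> number of unique vectors occurring that often
    freq_dist = Counter(len(flags) for flags in groups.values())
    return pureness, non_pure, freq_dist
```

Applied to the file loaded above, this would yield the occurrence frequency table and the pureness counts printed in the log.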

Final number of weight vectors to use: 644
  Number of unique weight vectors: 609

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (609, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 609 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 609 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 32 matches and 51 non-matches
    Purity of oracle classification:  0.614
    Entropy of oracle classification: 0.962
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 526 weight vectors
  Based on 32 matches and 51 non-matches
  Classified 150 matches and 376 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6144578313253012, 0.9618624139909456, 0.3855421686746988)
    (376, 0.6144578313253012, 0.9618624139909456, 0.3855421686746988)

Current size of match and non-match training data sets: 32 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.96
- Size 150 weight vectors
- Estimated match proportion 0.386

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 51 matches and 6 non-matches
    Purity of oracle classification:  0.895
    Entropy of oracle classification: 0.485
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)445_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 445), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)445_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 474
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 474 weight vectors
  Containing 191 true matches and 283 true non-matches
    (40.30% true matches)
  Identified 445 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   428  (96.18%)
          2 :    14  (3.15%)
          3 :     2  (0.45%)
         12 :     1  (0.22%)

Identified 1 non-pure unique weight vector (from 445 unique weight vectors)
Pureness (fraction of occurrences that are matches) for each unique weight vector:
  Pureness : Count
     1.000 : 164
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 280

Removed 1 non-pure weight vector

Final number of weight vectors to use: 473
  Number of unique weight vectors: 445

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (445, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 445 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 445 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 36 matches and 43 non-matches
    Purity of oracle classification:  0.544
    Entropy of oracle classification: 0.994
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 366 weight vectors
  Based on 36 matches and 43 non-matches
  Classified 292 matches and 74 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (292, 0.5443037974683544, 0.9943290455933882, 0.45569620253164556)
    (74, 0.5443037974683544, 0.9943290455933882, 0.45569620253164556)

Current size of match and non-match training data sets: 36 / 43

Selected cluster (queue ordering: random) with:
- Purity 0.54 and entropy 0.99
- Size 292 weight vectors
- Estimated match proportion 0.456

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 292 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.146, 0.130, 0.176, 0.318, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.472, 1.000, 0.154, 0.135, 0.196, 0.088, 0.197] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.367, 1.000, 0.154, 0.174, 0.125, 0.240, 0.226] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 44 matches and 28 non-matches
    Purity of oracle classification:  0.611
    Entropy of oracle classification: 0.964
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(15)173_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 173), dtype: object
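
The precision, recall, and f-measure values printed in the row above are consistent with the standard definitions (f-measure as the harmonic mean of precision and recall). A minimal check, assuming those standard definitions and taking the tp/fp/fn counts from the printed row:

```python
# Verify the printed metrics from the tp/fp/fn counts in the row above.
tp, fp, fn = 39, 0, 260

precision = tp / (tp + fp)                                 # 39 / 39
recall = tp / (tp + fn)                                    # 39 / 299
f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean

print(round(precision, 6), round(recall, 6), round(f_measure, 6))
# → 1.0 0.130435 0.230769
```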

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)173_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 537
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 537 weight vectors
  Containing 224 true matches and 313 true non-matches
    (41.71% true matches)
  Identified 498 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   479  (96.18%)
          2 :    16  (3.21%)
          3 :     2  (0.40%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vectors (from 498 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 310

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 536
  Number of unique weight vectors: 498

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (498, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 498 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 498 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
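
The "farthest first" selection above greedily picks, at each step, the unselected weight vector whose minimum distance to the already-selected set is largest, so the sample spreads out over the cluster. A minimal sketch of this greedy traversal — the exact distance metric, starting point, and tie-breaking used by the program are assumptions here:

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    nearest already-selected neighbour is farthest away (Euclidean)."""
    selected = [vectors[start]]
    # Distance of every vector to its nearest selected vector so far.
    dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        far_idx = max(range(len(vectors)), key=dist.__getitem__)
        selected.append(vectors[far_idx])
        # Selecting a new centre can only shrink nearest-neighbour distances.
        for i, v in enumerate(vectors):
            dist[i] = min(dist[i], math.dist(v, vectors[far_idx]))
    return selected

# Toy usage: from four 2-d points, pick the three most spread out.
pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.0, 1.0)]
print(farthest_first(pts, 3))
# → [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```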

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 33 matches and 47 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.978
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0
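
The purity and entropy figures in the oracle report above follow the usual definitions for a two-class split: purity is the fraction of the majority class, and entropy is the binary Shannon entropy (base 2) of the match proportion. A small sketch, assuming those are the definitions behind the logged values:

```python
import math

def purity_entropy(n_match, n_nonmatch):
    """Purity = majority-class fraction; entropy = binary Shannon entropy
    (base 2) of the match proportion."""
    total = n_match + n_nonmatch
    p = n_match / total
    purity = max(p, 1 - p)
    entropy = 0.0
    for q in (p, 1 - p):
        if q > 0:  # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# 33 matches and 47 non-matches, as in the oracle report above.
purity, entropy = purity_entropy(33, 47)
print(round(purity, 3), round(entropy, 3))   # → 0.588 0.978
```

These reproduce the cluster statistics carried into the next loop's queue (0.5875 and 0.9777945702913884).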

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 418 weight vectors
  Based on 33 matches and 47 non-matches
  Classified 151 matches and 267 non-matches
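
The splitting step above trains an SVM on the oracle-labelled vectors and partitions the remaining cluster by predicted class. A hedged sketch with scikit-learn on toy data — the kernel and parameters are assumptions, since the program's SVM settings are not shown in this log:

```python
# Sketch of the cluster-splitting step: fit an SVM on oracle-labelled
# weight vectors, then split the unlabelled remainder by predicted class.
from sklearn import svm

train_X = [[0.9, 0.8], [0.95, 0.9], [0.1, 0.2], [0.2, 0.1]]  # toy weight vectors
train_y = [1, 1, 0, 0]            # 1 = match, 0 = non-match (oracle labels)
rest_X = [[0.85, 0.9], [0.15, 0.25], [0.05, 0.1]]            # still unlabelled

clf = svm.SVC(kernel="linear")    # kernel choice is an assumption
clf.fit(train_X, train_y)
pred = clf.predict(rest_X)

matches = [v for v, p in zip(rest_X, pred) if p == 1]
non_matches = [v for v, p in zip(rest_X, pred) if p == 0]
print(len(matches), len(non_matches))   # → 1 2
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the next loop reports a queue length of 2.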

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.5875, 0.9777945702913884, 0.4125)
    (267, 0.5875, 0.9777945702913884, 0.4125)

Current size of match and non-match training data sets: 33 / 47

Selected cluster with (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 151 weight vectors
- Estimated match proportion 0.412

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.909, 1.000, 1.000, 1.000, 0.947] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 51 matches and 7 non-matches
    Purity of oracle classification:  0.879
    Entropy of oracle classification: 0.531
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)26_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 26), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)26_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 656
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 656 weight vectors
  Containing 194 true matches and 462 true non-matches
    (29.57% true matches)
  Identified 635 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   621  (97.80%)
          2 :    11  (1.73%)
          3 :     2  (0.31%)
          7 :     1  (0.16%)

Identified 0 non-pure unique weight vectors (from 635 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 462

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 656
  Number of unique weight vectors: 635

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (635, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 635 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 635 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 552 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 136 matches and 416 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (416, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 136 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 47 matches and 6 non-matches
    Purity of oracle classification:  0.887
    Entropy of oracle classification: 0.510
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(20)899_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 899), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)899_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
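
The farthest-first selection step logged above can be sketched as follows. This is a hypothetical re-implementation, not the script's actual code: it assumes Euclidean distance and seeding from the first vector, neither of which is confirmed by the log.

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector, then
    # repeatedly add the vector whose minimum Euclidean distance to the
    # already-selected set is largest.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    while len(selected) < k:
        remaining = [v for v in vectors if v not in selected]
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

A greedy traversal of this kind tends to pick extreme corner vectors early, which is consistent with the mix of near-all-0.000 and near-all-1.000 vectors at the top of the selection.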

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
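
The purity and entropy figures reported after each oracle step follow directly from the match/non-match counts; a minimal sketch of the presumed computation (majority-class fraction and binary Shannon entropy, base 2):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity = fraction of the majority class; entropy = binary Shannon
    # entropy (base 2) of the match/non-match proportions.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# Counts from the oracle step above: 23 matches, 65 non-matches.
purity, entropy = purity_entropy(23, 65)
```

With 23 matches out of 88, this reproduces the logged purity of 0.739 and entropy of 0.829.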

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
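
The splitting step trains an SVM on the 88 oracle-labelled vectors and partitions the remaining 956 by predicted class. A hypothetical sketch using scikit-learn — the original script's SVM library and kernel settings are not shown in the log, so a linear-kernel `SVC` with default parameters is assumed:

```python
from sklearn.svm import SVC

def split_cluster_by_svm(train_vectors, train_labels, cluster_vectors):
    # Train on the oracle-classified vectors (labels: 1 = match,
    # 0 = non-match), then split the remaining unlabelled cluster into a
    # predicted-match and a predicted-non-match sub-cluster.
    clf = SVC(kernel="linear")
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, predictions) if p == 0]
    return matches, non_matches
```

Both sub-clusters go back onto the queue, initially inheriting the parent's purity, entropy, and match-proportion estimates — hence the identical tuples reported for the 109- and 847-vector clusters in the next loop.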

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)303_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 303), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)303_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1075
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1075 weight vectors
  Containing 208 true matches and 867 true non-matches
    (19.35% true matches)
  Identified 1028 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   993  (96.60%)
          2 :    32  (3.11%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1028 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1074
  Number of unique weight vectors: 1028
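
The duplicate analysis above groups identical weight vectors and removes the minority-class copies of any non-pure group. A hypothetical sketch, assuming a unique weight vector counts as "non-pure" when its copies carry both match and non-match labels:

```python
def remove_minority_copies(vectors, labels):
    # Group identical weight vectors, compute each group's pureness
    # (fraction of copies that are true matches), and keep only the
    # majority-class copies of every group.
    groups = {}
    for vec, label in zip(vectors, labels):
        groups.setdefault(tuple(vec), []).append(label)

    kept = []
    for vec, group_labels in groups.items():
        pureness = sum(group_labels) / len(group_labels)
        majority = pureness >= 0.5  # True = matches are the majority class
        kept.extend((vec, majority) for lab in group_labels if lab == majority)
    return kept
```

For example, the vector occurring 12 times with pureness 0.917 above has 11 match copies and 1 non-match copy; the single minority (non-match) copy is the one removed.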

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1028, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1028 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1028 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 940 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 123 matches and 817 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 123 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-match
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(20)731_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 731), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)731_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
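Farthest-first selection, as used above, greedily picks vectors that are maximally distant from those already chosen, so the sample spreads over the weight-vector space. A minimal sketch with Euclidean distance; seeding from the first vector is an assumption, as the program may choose its starting point and tie-breaking differently:

```python
def farthest_first(vectors, k):
    # Greedy farthest-first traversal: start from the first vector, then
    # repeatedly add the candidate whose distance to its closest already
    # selected vector is largest.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]
    candidates = list(vectors[1:])
    while len(selected) < k and candidates:
        far = max(candidates, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(far)
        candidates.remove(far)
    return selected

print(farthest_first([(0, 0), (1, 0), (10, 0), (11, 0)], 2))
# [(0, 0), (11, 0)]
```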

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and misclassify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)427_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 427), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)427_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
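The occurrence table above can be reproduced with two nested Counters: one over the weight vectors themselves and one over the resulting counts. A sketch with toy data:

```python
from collections import Counter

# Toy weight vectors: one occurs once, one twice, one three times.
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
           (0.9, 0.9), (0.9, 0.9), (0.9, 0.9)]

# Inner Counter: how often each exact vector occurs.
# Outer Counter: how many vectors occur that often.
occ = Counter(Counter(vectors).values())
for times, count in sorted(occ.items()):
    print(times, ':', count)
```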

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and misclassify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analysing the file: diverg(10)142_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                0.9875
recall                 0.264214
f-measure              0.416887
da                           80
dm                            0
ndm                           0
tp                           79
fp                            1
tn                  4.76529e+07
fn                          220
Name: (10, 1 - acm diverg, 142), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)142_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 669
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 669 weight vectors
  Containing 178 true matches and 491 true non-matches
    (26.61% true matches)
  Identified 648 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   638  (98.46%)
          2 :     7  (1.08%)
          3 :     2  (0.31%)
         11 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 648 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 157
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 490

Removed 1 non-pure weight vector

Final number of weight vectors to use: 668
  Number of unique weight vectors: 648

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (648, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 648 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 648 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and misclassify 0
  Classified 35 matches and 48 non-matches
    Purity of oracle classification:  0.578
    Entropy of oracle classification: 0.982
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 565 weight vectors
  Based on 35 matches and 48 non-matches
  Classified 284 matches and 281 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (284, 0.5783132530120482, 0.9822309298084992, 0.42168674698795183)
    (281, 0.5783132530120482, 0.9822309298084992, 0.42168674698795183)

Current size of match and non-match training data sets: 35 / 48

Selected cluster (queue ordering: random) with:
- Purity 0.58 and entropy 0.98
- Size 284 weight vectors
- Estimated match proportion 0.422

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 284 vectors
  The selected farthest weight vectors are:
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.207, 0.160, 0.185, 0.212, 0.121] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and misclassify 0
  Classified 42 matches and 28 non-matches
    Purity of oracle classification:  0.600
    Entropy of oracle classification: 0.971
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

80.0
Analysing the file: diverg(15)403_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981481
recall                 0.177258
f-measure              0.300283
da                           54
dm                            0
ndm                           0
tp                           53
fp                            1
tn                  4.76529e+07
fn                          246
Name: (15, 1 - acm diverg, 403), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)403_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1056
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1056 weight vectors
  Containing 211 true matches and 845 true non-matches
    (19.98% true matches)
  Identified 1002 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   967  (96.51%)
          2 :    32  (3.19%)
          3 :     2  (0.20%)
         19 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1002 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 177
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 824

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1055
  Number of unique weight vectors: 1002

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1002, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1002 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1002 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
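
The purity and entropy figures reported by the oracle step can be reproduced directly from the match/non-match counts. A minimal sketch (the function name is illustrative, not taken from the original script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of weight vectors in the majority class;
    entropy = binary Shannon entropy of the match/non-match split."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# Reproduces the figures logged above: 31 matches, 56 non-matches
purity, entropy = purity_entropy(31, 56)
print(round(purity, 3), round(entropy, 3))   # 0.644 0.94
```

The match proportion `p = 31/87 = 0.356` is also the "estimated match proportion" carried into the cluster queue below.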

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 915 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 319 matches and 596 non-matches
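
The SVM step trains on the oracle-labelled vectors and splits the rest of the cluster into two sub-clusters by predicted class. A hedged sketch assuming scikit-learn (the original script may use a different SVM library; the function name and toy data are illustrative):

```python
import numpy as np
from sklearn import svm

def split_cluster_by_svm(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining vectors into predicted matches and non-matches."""
    clf = svm.SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(remaining_vecs)
    return remaining_vecs[pred == 1], remaining_vecs[pred == 0]

# Toy illustration: high similarity weights -> match (1), low -> non-match (0)
rng = np.random.default_rng(42)
train = np.vstack([rng.uniform(0.7, 1.0, (10, 7)),   # match-like vectors
                   rng.uniform(0.0, 0.3, (10, 7))])  # non-match-like vectors
labels = np.array([1] * 10 + [0] * 10)
rest = np.vstack([rng.uniform(0.7, 1.0, (5, 7)),
                  rng.uniform(0.0, 0.3, (5, 7))])
m, nm = split_cluster_by_svm(train, labels, rest)
print(len(m), len(nm))   # 5 5
```

The two predicted sub-clusters (here 319 matches and 596 non-matches) are what gets pushed back onto the queue for Loop 2.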

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (319, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (596, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 319 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 319 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
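
The "farthest first" selections above can be sketched as a greedy farthest-first traversal. The seeding and tie-breaking of the actual script may differ (e.g. a random seed); the function name and toy points below are illustrative only:

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector,
    then repeatedly add the vector whose minimum Euclidean distance
    to the already-selected set is largest."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]                 # seed choice is arbitrary here
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# One-dimensional toy example: the extremes are picked first
pts = [(0.0,), (0.1,), (0.5,), (0.9,), (1.0,)]
print(farthest_first(pts, 3))   # [(0.0,), (1.0,), (0.5,)]
```

This spread-out sampling is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.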

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 41 matches and 28 non-matches
    Purity of oracle classification:  0.594
    Entropy of oracle classification: 0.974
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
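
The overall procedure traced in this log (pop a cluster, sample it, query the oracle, split impure clusters, repeat until the budget runs out) can be outlined as below. This is a schematic sketch only: the prefix slice stands in for farthest-first sampling and the midpoint cut stands in for the SVM split, and all names are illustrative:

```python
import random

def recursive_train_selection(vectors, oracle, budget,
                              min_purity=0.95, max_cluster_size=100):
    """Schematic shape of the recursive selection loop logged above."""
    queue = [list(vectors)]
    train_match, train_non_match = [], []
    used = 0
    while queue and used < budget:
        cluster = random.choice(queue)       # queue ordering: random
        queue.remove(cluster)
        sample = cluster[:budget - used]     # stand-in for farthest-first
        used += len(sample)
        for vec in sample:
            (train_match if oracle(vec) else train_non_match).append(vec)
        rest = cluster[len(sample):]
        n_m = sum(1 for v in sample if oracle(v))
        purity = max(n_m, len(sample) - n_m) / max(1, len(sample))
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            mid = len(rest) // 2             # stand-in for the SVM split
            queue.extend([rest[:mid], rest[mid:]])
    return train_match, train_non_match

# Toy run: values below 5 count as matches, budget of 4 oracle calls
m, n = recursive_train_selection(range(10), lambda v: v < 5, budget=4)
print(len(m), len(n))   # 4 0
```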

54.0
Analyzing file: diverg(10)620_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 620), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)620_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 683
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 683 weight vectors
  Containing 218 true matches and 465 true non-matches
    (31.92% true matches)
  Identified 628 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   592  (94.27%)
          2 :    33  (5.25%)
          3 :     2  (0.32%)
         19 :     1  (0.16%)
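
The frequency distribution above (how many unique weight vectors occur once, twice, etc.) can be computed with two nested counts; a minimal sketch with an illustrative function name:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight
    vectors that occur exactly that often."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())

# Toy example: one vector occurs once, one twice, one three times
vecs = [(0.1,), (0.1,), (0.2,), (0.3,), (0.3,), (0.3,)]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```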

Identified 1 non-pure unique weight vector (from 628 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 444

Removed 1 non-pure weight vector
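
The pureness step groups identical weight vectors, computes the fraction of true matches per unique vector, and removes the minority-class copies of any vector that occurs with both labels (here, the vector with pureness 0.947). A sketch under those assumptions; names are illustrative, and the actual script's tie handling at pureness 0.5 is not shown in the log:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels):
    """Drop minority-class copies of weight vectors that occur with
    both a match and a non-match label."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)

    kept = []
    for vec, labs in groups.items():
        pureness = sum(labs) / len(labs)   # fraction labelled as match
        majority = pureness >= 0.5
        for lab in labs:
            if pureness in (0.0, 1.0) or lab == majority:
                kept.append((vec, lab))
    return kept

# Toy example: one vector occurs 3x as match, 1x as non-match;
# the single minority (non-match) copy is removed
vecs = [(0.9, 0.8)] * 4 + [(0.1, 0.2)]
labs = [True, True, True, False, False]
print(len(remove_non_pure(vecs, labs)))   # 4
```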

Final number of weight vectors to use: 682
  Number of unique weight vectors: 628

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (628, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 628 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 628 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 29 matches and 54 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.934
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 545 weight vectors
  Based on 29 matches and 54 non-matches
  Classified 165 matches and 380 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (165, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)
    (380, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)

Current size of match and non-match training data sets: 29 / 54

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 380 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 380 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.462, 0.609, 0.684, 0.308, 0.545] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 2 matches and 69 non-matches
    Purity of oracle classification:  0.972
    Entropy of oracle classification: 0.185
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(20)424_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 424), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)424_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 566 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
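
The farthest-first traversals logged above can be sketched as follows. This is a minimal sketch, not the program's actual implementation: the function name `farthest_first`, the Euclidean metric, and the random starting vector are assumptions.

```python
import math
import random

def farthest_first(vectors, k, seed=0):
    # Greedy farthest-first traversal: start from a random vector, then
    # repeatedly select the vector whose minimum Euclidean distance to
    # the already selected set is largest.
    rng = random.Random(seed)
    selected = [rng.randrange(len(vectors))]
    min_dist = [math.dist(v, vectors[selected[0]]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[nxt]))
    return selected

# Tiny usage example: three well separated groups of 2-D points; the
# traversal picks one representative from each group.
points = [[0, 0], [0.1, 0], [10, 0], [10.1, 0], [5, 8]]
print(farthest_first(points, 3))
```

Because each new pick maximises the distance to everything chosen so far, the selection spreads across the weight-vector space rather than oversampling one dense region.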

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0
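
The purity and entropy figures reported for each oracle classification follow the usual binary definitions: purity is the majority-class fraction, and entropy is the Shannon entropy of the match/non-match split. A minimal sketch (function names are illustrative, not the program's) that reproduces the 0.904 / 0.456 figures above:

```python
import math

def purity(num_match, num_non_match):
    # Majority-class fraction of the cluster.
    return max(num_match, num_non_match) / (num_match + num_non_match)

def entropy(num_match, num_non_match):
    # Binary Shannon entropy (in bits) of the match / non-match split.
    total = num_match + num_non_match
    return -sum(c / total * math.log2(c / total)
                for c in (num_match, num_non_match) if c > 0)

# The 7 matches / 66 non-matches classified above
print(round(purity(7, 66), 3), round(entropy(7, 66), 3))  # 0.904 0.456
```

A perfectly balanced cluster has purity 0.5 and entropy 1.0, which is why fresh clusters enter the queue with exactly those values.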

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)136_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 136), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)136_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 996
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 996 weight vectors
  Containing 212 true matches and 784 true non-matches
    (21.29% true matches)
  Identified 944 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   909  (96.29%)
          2 :    32  (3.39%)
          3 :     2  (0.21%)
         17 :     1  (0.11%)
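
The occurrence distribution above (how often each distinct weight vector appears) can be tabulated with two nested `Counter` passes; a minimal sketch on toy data:

```python
from collections import Counter

# Sketch of the duplicate analysis above: count how often each distinct
# weight vector occurs, then tabulate the distribution of those counts.
vectors = [(1.0, 0.0), (1.0, 0.0), (0.5, 0.5),
           (0.2, 0.8), (0.2, 0.8), (0.2, 0.8)]
occ = Counter(vectors)        # weight vector -> occurrence count
dist = Counter(occ.values())  # occurrence count -> number of distinct vectors
print(sorted(dist.items()))   # [(1, 1), (2, 1), (3, 1)]
```

Tuples are used as the `Counter` keys because lists are unhashable; the real weight vectors would be converted the same way.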

Identified 1 non-pure unique weight vector (from 944 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 763

Removed 1 non-pure weight vector
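
The pureness step above can be sketched as follows: group identical weight vectors, compute the match fraction (pureness) per group, and drop the minority-class copies of any non-pure group. The toy group below has pureness 16/17 ≈ 0.941, the same value reported in the log; variable names are illustrative assumptions.

```python
from collections import defaultdict

# One vector occurring 16 times as a match and once as a non-match,
# plus one pure non-match vector.
pairs = [((0.9, 0.8), True)] * 16 + [((0.9, 0.8), False), ((0.1, 0.2), False)]

counts = defaultdict(lambda: [0, 0])  # vector -> [matches, non-matches]
for vec, is_match in pairs:
    counts[vec][0 if is_match else 1] += 1

# Keep only the copies that agree with their vector's majority class.
kept = [(vec, is_match) for vec, is_match in pairs
        if is_match == (counts[vec][0] >= counts[vec][1])]

pureness = counts[(0.9, 0.8)][0] / sum(counts[(0.9, 0.8)])
print(round(pureness, 3), len(pairs) - len(kept))  # 0.941 1
```

Removing the single minority-class copy leaves every surviving weight vector with an unambiguous true match status, which the later oracle and classifier steps rely on.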

Final number of weight vectors to use: 995
  Number of unique weight vectors: 944

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (944, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 944 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87
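
The per-cluster sample sizes are consistent with Cochran's finite-population sample-size formula at z = 1.96, margin of error 0.1, and the cluster's estimated match proportion as p. This is an assumption — the log does not show the program's formula or rounding — but it does reproduce the 87 reported for this 944-vector cluster:

```python
def sample_size(cluster_size, est_match_prop, z=1.96, err=0.1):
    # Cochran's formula with finite-population correction. Assumed, not
    # taken from the program: its exact formula and rounding may differ.
    p = est_match_prop
    n0 = z * z * p * (1.0 - p) / (err * err)
    n = n0 / (1.0 + (n0 - 1.0) / cluster_size)
    return round(n)

# 944 unique weight vectors, estimated match proportion 0.5 (loop 1)
print(sample_size(944, 0.5))  # 87
```

The finite-population correction is what makes smaller clusters (such as the 153-vector one later in the log) receive proportionally smaller samples.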

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 944 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 29 matches and 58 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
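
The oracle step can be simulated as a label source with a configurable accuracy; with `oracle_acc` at 100%, as logged above, no labels are flipped. A minimal sketch — the function name and the independent per-label flipping scheme are assumptions:

```python
import random

def noisy_oracle(true_labels, accuracy, seed=None):
    # Return each true label with probability `accuracy`; otherwise flip
    # it, simulating a manual classifier who sometimes errs.
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]

# 29 matches and 58 non-matches, queried at 100% accuracy
truth = [True] * 29 + [False] * 58
answers = noisy_oracle(truth, 1.0, seed=42)
print(answers == truth)  # True
```

Lowering `accuracy` below 1.0 would introduce the false matches and false non-matches that the summary lines above count.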

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 857 weight vectors
  Based on 29 matches and 58 non-matches
  Classified 147 matches and 710 non-matches
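
The split step above trains a classifier on the oracle-labelled vectors and partitions the remaining cluster into predicted matches and non-matches. As a library-free stand-in for the SVM — a nearest-centroid classifier, plainly not the program's actual method — the idea can be sketched as:

```python
import math

def centroid_split(train_match, train_non_match, unlabelled):
    # Stand-in for the SVM split classifier: assign each unlabelled
    # weight vector to the class whose training centroid is nearer.
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    cm, cn = centroid(train_match), centroid(train_non_match)
    return [math.dist(v, cm) < math.dist(v, cn) for v in unlabelled]

# Toy example: matches cluster near (1, 1), non-matches near (0, 0)
matches = [[1.0, 1.0], [0.9, 1.0]]
non_matches = [[0.0, 0.0], [0.1, 0.0]]
print(centroid_split(matches, non_matches, [[1.0, 0.9], [0.0, 0.1]]))
```

Either way, the two predicted subsets are pushed back onto the queue as new clusters, which is why the queue length grows from 1 to 2 between loops.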

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (710, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 29 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.92
- Size 710 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 710 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 4 matches and 72 non-matches
    Purity of oracle classification:  0.947
    Entropy of oracle classification: 0.297
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(20)872_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 872), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)872_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 153 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)747_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 747), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)747_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 752
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 752 weight vectors
  Containing 222 true matches and 530 true non-matches
    (29.52% true matches)
  Identified 716 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   697  (97.35%)
          2 :    16  (2.23%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 716 unique weight vectors)
Pureness (fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 527

Removed 1 non-pure weight vector

Final number of weight vectors to use: 751
  Number of unique weight vectors: 716
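The non-pure removal step above groups identical weight vectors and, for any group containing both matches and non-matches, drops the minority-class copies so every unique vector maps to a single class. A sketch under that reading of the log (`remove_non_pure` and the `(vector, is_match)` tuple representation are illustrative, not the original implementation):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    """Group identical weight vectors; for any group mixing matches
    and non-matches, drop the minority-class copies.  Each input item
    is a (vector, is_match) pair; returns (kept_pairs, num_removed)."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[tuple(vec)].append(is_match)
    kept, removed = [], 0
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)   # fraction of matches
        if pureness in (0.0, 1.0):             # already pure: keep all
            kept.extend((list(vec), lab) for lab in labels)
        else:                                  # mixed: keep majority class only
            majority = pureness >= 0.5
            for lab in labels:
                if lab == majority:
                    kept.append((list(vec), lab))
                else:
                    removed += 1
    return kept, removed
```

For example, a vector occurring 17 times with 16 matches has pureness 16/17 ≈ 0.941, matching the logged minority-class removal of a single copy.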

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (716, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 716 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 716 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
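The "far" selection above appears to be a farthest-first traversal: starting from one vector, repeatedly pick the vector whose distance to the nearest already-selected vector is largest. A minimal sketch under that assumption (deterministic start from the first vector and Euclidean distance; the original may seed and measure differently):

```python
import math

def farthest_first(vectors, k):
    """Farthest-first traversal: greedily select k vectors so that each
    new pick maximises its distance to the closest vector chosen so far."""
    selected = [vectors[0]]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected
```

Each round costs one pass over the candidate set, so selecting k of n vectors is O(k·n) distance evaluations.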

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 632 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 146 matches and 486 non-matches
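The log indicates the remaining cluster is split into two child clusters by an SVM trained on the oracle-labelled sample. A sketch using scikit-learn's `SVC` (an assumption; the original program may use a different SVM binding or kernel):

```python
from sklearn.svm import SVC

def svm_split(train_match, train_non_match, unlabelled):
    """Train an SVM on the oracle-labelled weight vectors, then split
    the remaining vectors into predicted-match and predicted-non-match
    child clusters."""
    X = train_match + train_non_match
    y = [1] * len(train_match) + [0] * len(train_non_match)
    clf = SVC(kernel="linear").fit(X, y)
    preds = clf.predict(unlabelled)
    matches = [v for v, p in zip(unlabelled, preds) if p == 1]
    non_matches = [v for v, p in zip(unlabelled, preds) if p == 0]
    return matches, non_matches
```

Both children are then pushed back onto the cluster queue, inheriting the purity, entropy, and estimated match proportion of the oracle sample, as the Loop 2 queue statistics below show.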

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (486, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 146 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 53 matches and 2 non-matches
    Purity of oracle classification:  0.964
    Entropy of oracle classification: 0.225
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)53_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (10, 1 - acm diverg, 53), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)53_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 786
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 786 weight vectors
  Containing 217 true matches and 569 true non-matches
    (27.61% true matches)
  Identified 748 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   730  (97.59%)
          2 :    15  (2.01%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 748 unique weight vectors)
Pureness (fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 181
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 566

Removed 1 non-pure weight vector

Final number of weight vectors to use: 785
  Number of unique weight vectors: 748

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (748, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 748 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 748 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 663 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 142 matches and 521 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (521, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 521 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 521 vectors
  The selected farthest weight vectors are:
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.870, 0.619, 0.643, 0.700, 0.524] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 6 matches and 69 non-matches
    Purity of oracle classification:  0.920
    Entropy of oracle classification: 0.402
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(20)991_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 991), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)991_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
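
The purity and entropy reported for each oracle-classified sample follow from the match and non-match counts alone; a minimal sketch (the function names are illustrative, not taken from the original program):

```python
import math

def purity(num_match, num_non_match):
    # Fraction of the sample belonging to the majority class
    total = num_match + num_non_match
    return max(num_match, num_non_match) / float(total)

def entropy(num_match, num_non_match):
    # Shannon entropy (base 2) of the match / non-match split
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        if count > 0:
            p = count / float(total)
            h -= p * math.log(p, 2)
    return h

# Counts from the sample above: 23 matches, 65 non-matches
print(round(purity(23, 65), 3))   # 0.739
print(round(entropy(23, 65), 3))  # 0.829
```

The estimated match proportion shown for the resulting clusters (0.261) is simply 23/88, the match fraction of this same sample.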

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
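
The SVM step trains on the oracle-labelled sample (here 23 matches and 65 non-matches) and splits the remaining cluster members by predicted class. A minimal sketch using scikit-learn's `svm.SVC`; the toy data and the linear kernel are assumptions, as the original program's SVM settings are not shown in this output:

```python
from sklearn import svm

# Toy oracle-labelled sample: 1 = match, 0 = non-match (illustrative data)
train_vectors = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]]
train_labels = [1, 1, 0, 0]

clf = svm.SVC(kernel="linear")
clf.fit(train_vectors, train_labels)

# Split the still-unlabelled cluster members into two sub-clusters
remaining = [[0.85, 0.75], [0.15, 0.2], [0.7, 0.9]]
pred = clf.predict(remaining)
match_cluster = [v for v, p in zip(remaining, pred) if p == 1]
non_match_cluster = [v for v, p in zip(remaining, pred) if p == 0]
```

Both sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.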

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
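
The farthest-first listings above can be sketched as a greedy max-min traversal over the weight-vector space; the Euclidean metric and the choice of the first seed are assumptions, not taken from the original program:

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector, then
    # repeatedly add the vector whose minimum distance to the already
    # selected set is largest, until k vectors are chosen.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.9, 0.1)], 2)
print(sample)  # [(0.0, 0.0), (1.0, 1.0)]
```

Spreading the sample across the space in this way is why the selected vectors above mix clear matches and clear non-matches rather than clustering around one class.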

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)228_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 228), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)228_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)
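
The frequency distribution above first counts how often each distinct weight vector occurs, then counts how many vectors share each occurrence count; a sketch using `collections.Counter` (the toy data is illustrative):

```python
from collections import Counter

# Vectors as tuples so they are hashable (illustrative data)
weight_vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.9), (1.0, 0.5)]

occurrences = Counter(weight_vectors)      # vector -> how often it occurs
freq_dist = Counter(occurrences.values())  # occurrence count -> #vectors

num_unique = len(occurrences)
for occ, num in sorted(freq_dist.items()):
    # Percentages are taken over the number of unique vectors,
    # matching the listing above (e.g. 1001 / 1036 = 96.62%)
    print('%5d : %5d  (%.2f%%)' % (occ, num, 100.0 * num / num_unique))
```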

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
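
Making every unique weight vector pure, as done above, amounts to dropping the minority-class copies of any vector that occurs with both true match statuses; a minimal sketch (the data and names are illustrative):

```python
from collections import defaultdict

# Each pair is (weight vector, true match status); the first vector occurs
# three times with mixed labels, so it is non-pure (illustrative data)
pairs = [((0.9, 0.8), True), ((0.9, 0.8), True), ((0.9, 0.8), False),
         ((0.1, 0.2), False)]

labels_per_vec = defaultdict(list)
for vec, is_match in pairs:
    labels_per_vec[vec].append(is_match)

cleaned = []
for vec, is_match in pairs:
    labels = labels_per_vec[vec]
    num_match = labels.count(True)
    pure = num_match in (0, len(labels))
    majority_is_match = num_match > len(labels) - num_match
    # Keep pure vectors and the majority-class copies of non-pure ones
    if pure or is_match == majority_is_match:
        cleaned.append((vec, is_match))

print(len(pairs) - len(cleaned))  # 1 minority-class copy removed
```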

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 24 matches and 64 non-matches
    Purity of oracle classification:  0.727
    Entropy of oracle classification: 0.845
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 24 matches and 64 non-matches
  Classified 95 matches and 853 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (95, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)
    (853, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)

Current size of match and non-match training data sets: 24 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.73 and entropy 0.85
- Size 853 weight vectors
- Estimated match proportion 0.273

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 853 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 15 matches and 55 non-matches
    Purity of oracle classification:  0.786
    Entropy of oracle classification: 0.750
    Number of true matches:      15
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(20)501_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 501), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)501_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 961
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 961 weight vectors
  Containing 217 true matches and 744 true non-matches
    (22.58% true matches)
  Identified 906 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   870  (96.03%)
          2 :    33  (3.64%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 906 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 723

Removed 1 non-pure weight vector

Final number of weight vectors to use: 960
  Number of unique weight vectors: 906

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (906, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 906 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 906 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 819 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 135 matches and 684 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (684, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 135 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 135 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
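
The farthest-first selections above can be reproduced with a simple greedy traversal: keep a selected set and repeatedly add the vector whose distance to its nearest already-selected vector is largest. A dependency-free sketch (the program's actual seeding strategy and distance metric are assumptions here; Euclidean distance and a fixed first seed are used for illustration):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector,
    then repeatedly pick the vector maximising the distance to its
    closest already-selected vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    remaining = [v for v in vectors[1:]]
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because each new pick is as far as possible from everything chosen so far, the sample spreads across the cluster rather than concentrating in one region.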

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 50 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.139
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
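
Putting the log together, one full run follows a simple budgeted loop: pop a cluster from the queue, sample from it, ask the oracle, record the labels as training data, and either accept the cluster or split it and re-queue the parts until the manual classification budget is spent. A high-level sketch with the per-step policies injected as callables (all names are mine; details such as the sample-size rule are omitted):

```python
def recursive_selection(initial_cluster, budget, sample_fn, oracle_fn,
                        split_fn, is_final_fn):
    """Budgeted recursive training-example selection (sketch).
    sample_fn(cluster) -> vectors to label; oracle_fn(vecs) -> labels;
    split_fn(matches, non_matches, rest) -> list of sub-clusters;
    is_final_fn(labels, rest) -> True when the cluster needs no split."""
    queue = [initial_cluster]
    train_m, train_n, used = [], [], 0
    while queue and used < budget:
        cluster = queue.pop(0)                 # queue ordering policy
        sample = sample_fn(cluster)            # e.g. farthest-first
        labels = oracle_fn(sample)             # manual classification
        used += len(sample)
        train_m += [v for v, l in zip(sample, labels) if l]
        train_n += [v for v, l in zip(sample, labels) if not l]
        rest = [v for v in cluster if v not in sample]
        if rest and not is_final_fn(labels, rest):
            queue += split_fn(train_m, train_n, rest)
    return train_m, train_n, used
```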

40.0
Analysing file: diverg(20)553_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 553), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)553_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1100
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1100 weight vectors
  Containing 227 true matches and 873 true non-matches
    (20.64% true matches)
  Identified 1043 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1006  (96.45%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
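
The occurrence table above (how many distinct weight vectors appear once, twice, and so on) is a straightforward double count over the vector multiset. A sketch (the function name is mine):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of distinct weight vectors that
    occur that often, as in the frequency table printed above."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return dict(Counter(per_vector.values()))
```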

Identified 1 non-pure unique weight vector (from 1043 unique weight vectors)
Pureness (fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector
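
A non-pure unique weight vector is one generated by both true matches and true non-matches; the clean-up step keeps the majority-class copies and drops the minority ones (here likely the vector occurring 20 times, given its 0.950 pureness). A sketch of that step with labels as 0/1 ints (the helper name and exact tie-breaking are mine):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels):
    """Group identical weight vectors; in every non-pure group (one
    containing both matches and non-matches) drop the minority-class
    copies and report how many were removed."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)
    kept, removed = [], 0
    for vec, labs in groups.items():
        pureness = sum(labs) / len(labs)      # fraction of matches
        majority = 1 if pureness >= 0.5 else 0
        for lab in labs:
            if 0.0 < pureness < 1.0 and lab != majority:
                removed += 1                  # minority copy dropped
            else:
                kept.append((list(vec), lab))
    return kept, removed
```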

Final number of weight vectors to use: 1099
  Number of unique weight vectors: 1043

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1043, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1043 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1043 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 955 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 846 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (846, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)364_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 364), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)364_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)314_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 314), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)314_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1085
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1085 weight vectors
  Containing 220 true matches and 865 true non-matches
    (20.28% true matches)
  Identified 1029 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   993  (96.50%)
          2 :    33  (3.21%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1029 unique weight vectors)
Pureness (fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 844

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1084
  Number of unique weight vectors: 1029

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1029, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1029 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1029 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

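The "farthest first" selection above is a greedy farthest-first traversal: after seeding the selection, each step picks the vector whose distance to its nearest already-selected vector is largest. A minimal sketch, assuming Euclidean distance and a fixed starting index (both assumptions; the program's exact choices are not shown in this log):

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Greedily select k vectors by farthest-first traversal.

    `start` is an assumed seed index; the original program may seed
    differently (e.g. from a random vector or a corner).
    """
    X = np.asarray(vectors, dtype=float)
    selected = [start]
    # minimum distance from every vector to the selected set so far
    d = np.linalg.norm(X - X[start], axis=1)
    while len(selected) < k:
        i = int(np.argmax(d))          # vector farthest from the selection
        selected.append(i)
        d = np.minimum(d, np.linalg.norm(X - X[i], axis=1))
    return selected
```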
Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

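The purity and entropy figures reported for each classified sample follow the standard two-class definitions (purity = majority-class fraction, entropy in bits), which can be sketched as:

```python
import math

def purity(num_match, num_nonmatch):
    """Fraction of the sample in the majority class."""
    total = num_match + num_nonmatch
    return max(num_match, num_nonmatch) / total

def entropy(num_match, num_nonmatch):
    """Two-class Shannon entropy in bits (0 = pure, 1 = 50/50 split)."""
    total = num_match + num_nonmatch
    h = 0.0
    for count in (num_match, num_nonmatch):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# for the 25 matches / 63 non-matches above:
# purity -> 0.716, entropy -> 0.861 (matching the log)
```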
Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 941 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 125 matches and 816 non-matches

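The SVM step trains on the oracle-labelled sample (25 matches, 63 non-matches) and splits the remaining weight vectors of the cluster by predicted class, yielding the two child clusters pushed onto the queue. A sketch using scikit-learn's `SVC` on stand-in random data — the kernel, parameters, and data here are all assumptions, since the log does not show the program's SVM settings:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# stand-in oracle-labelled training sample: 25 "matches", 63 "non-matches"
X_train = np.vstack([rng.uniform(0.6, 1.0, (25, 7)),
                     rng.uniform(0.0, 0.5, (63, 7))])
y_train = np.array([1] * 25 + [0] * 63)

clf = SVC(kernel='linear')   # kernel choice is an assumption
clf.fit(X_train, y_train)

# split the 941 remaining weight vectors by predicted class,
# producing the two child clusters placed on the queue
X_rest = rng.uniform(0.0, 1.0, (941, 7))
pred = clf.predict(X_rest)
match_cluster = X_rest[pred == 1]
non_match_cluster = X_rest[pred == 0]
```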
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (125, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (816, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 816 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 816 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 11 matches and 60 non-matches
    Purity of oracle classification:  0.845
    Entropy of oracle classification: 0.622
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

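The oracle step above (run here at 100% accuracy, so no labels are flipped) can be simulated along these lines — the function name and the independent-flip error model are assumptions, not taken from the program:

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Simulate a human oracle with a given accuracy.

    Each true label is returned unchanged with probability `accuracy`
    and flipped otherwise; at accuracy 1.0, as in this log, every
    label is returned correctly.
    """
    rng = rng or random.Random(0)
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]
```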
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(20)340_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 340), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)340_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 817 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.86
- Size 131 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 131 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 48 matches and 1 non-match
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)286_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987952
recall                 0.274247
f-measure              0.429319
da                           83
dm                            0
ndm                           0
tp                           82
fp                            1
tn                  4.76529e+07
fn                          217
Name: (10, 1 - acm diverg, 286), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)286_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 340
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 340 weight vectors
  Containing 170 true matches and 170 true non-matches
    (50.00% true matches)
  Identified 320 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   309  (96.56%)
          2 :     8  (2.50%)
          3 :     2  (0.62%)
          9 :     1  (0.31%)

Identified 1 non-pure unique weight vector (from 320 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 152
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 167

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 331
  Number of unique weight vectors: 319

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (319, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 319 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 319 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 31 matches and 43 non-matches
    Purity of oracle classification:  0.581
    Entropy of oracle classification: 0.981
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 245 weight vectors
  Based on 31 matches and 43 non-matches
  Classified 121 matches and 124 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 74
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (121, 0.581081081081081, 0.9809470132751208, 0.4189189189189189)
    (124, 0.581081081081081, 0.9809470132751208, 0.4189189189189189)

Current size of match and non-match training data sets: 31 / 43

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 121 weight vectors
- Estimated match proportion 0.419

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 121 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
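
Farthest-first selection greedily picks, at each step, the weight vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster. A sketch (Euclidean distance and the choice of seed vector are assumptions, not stated in the log):

```python
import math

def farthest_first(vectors, k, seed_index=0):
    """Greedy farthest-first traversal over a list of weight vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[seed_index]]
    remaining = [v for i, v in enumerate(vectors) if i != seed_index]
    while remaining and len(selected) < k:
        # Pick the vector farthest from its nearest selected neighbour
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

print(farthest_first([[0.0, 0.0], [1.0, 0.0], [0.1, 0.0], [10.0, 0.0]], 3))
# → [[0.0, 0.0], [10.0, 0.0], [1.0, 0.0]]
```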

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 47 matches and 6 non-matches
    Purity of oracle classification:  0.887
    Entropy of oracle classification: 0.510
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

83.0
Analysing the file: diverg(20)181_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 181), dtype: object
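
The precision, recall, and f-measure rows of this per-file summary follow from the confusion counts in the same Series (tp = 39, fp = 0, fn = 260):

```python
# Confusion counts taken from the Series printed above
tp, fp, fn = 39, 0, 260

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_measure = 2 * precision * recall / (precision + recall)

print(precision, round(recall, 6), round(f_measure, 6))
# → 1.0 0.130435 0.230769
```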

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)181_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 862
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 862 weight vectors
  Containing 227 true matches and 635 true non-matches
    (26.33% true matches)
  Identified 805 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   768  (95.40%)
          2 :    34  (4.22%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 805 unique weight vectors)
Pureness (as fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 614

Removed 1 non-pure weight vector

Final number of weight vectors to use: 861
  Number of unique weight vectors: 805
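
The pureness step groups identical weight vectors and checks whether all occurrences of a group carry the same true match status; in a non-pure group, the minority-class copies are removed before selection starts. A toy sketch of that grouping (the data and variable names are ours):

```python
from collections import defaultdict

# (weight vector, true match status) pairs; duplicate vectors may disagree
vectors = [
    ((1.0, 0.9), True), ((1.0, 0.9), True),   # pure match vector
    ((0.1, 0.2), False),                      # pure non-match vector
    ((0.5, 0.5), True), ((0.5, 0.5), False),  # non-pure: pureness 0.5
]

groups = defaultdict(list)
for vec, is_match in vectors:
    groups[vec].append(is_match)

# Pureness = fraction of a vector's occurrences that are true matches
pureness = {vec: sum(labels) / len(labels) for vec, labels in groups.items()}
non_pure = [vec for vec, p in pureness.items() if 0.0 < p < 1.0]
print(non_pure)  # → [(0.5, 0.5)]
```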

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (805, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 805 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 805 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 719 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 566 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (566, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 566 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 566 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 7 matches and 66 non-matches
    Purity of oracle classification:  0.904
    Entropy of oracle classification: 0.456
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)640_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 640), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)640_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 376
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 376 weight vectors
  Containing 192 true matches and 184 true non-matches
    (51.06% true matches)
  Identified 355 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   341  (96.06%)
          2 :    11  (3.10%)
          3 :     2  (0.56%)
          7 :     1  (0.28%)

Identified 0 non-pure unique weight vectors (from 355 unique weight vectors)
Pureness (as fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 171
     0.000 : 184

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 376
  Number of unique weight vectors: 355

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (355, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 355 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 355 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 38 matches and 37 non-matches
    Purity of oracle classification:  0.507
    Entropy of oracle classification: 1.000
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 280 weight vectors
  Based on 38 matches and 37 non-matches
  Classified 136 matches and 144 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.5066666666666667, 0.999871756640849, 0.5066666666666667)
    (144, 0.5066666666666667, 0.999871756640849, 0.5066666666666667)

Current size of match and non-match training data sets: 38 / 37

Selected cluster (queue ordering: random) with:
- Purity 0.51 and entropy 1.00
- Size 136 weight vectors
- Estimated match proportion 0.507

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 49 matches and 7 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(20)390_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (20, 1 - acm diverg, 390), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)390_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 963
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 963 weight vectors
  Containing 212 true matches and 751 true non-matches
    (22.01% true matches)
  Identified 910 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   875  (96.15%)
          2 :    32  (3.52%)
          3 :     2  (0.22%)
         18 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 910 unique weight vectors)
Pureness (as fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 179
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 730

Removed 1 non-pure weight vector

Final number of weight vectors to use: 962
  Number of unique weight vectors: 910

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (910, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 910 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 910 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
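The "farthest first" selection above is the classic greedy farthest-first traversal: start from one vector and repeatedly pick the vector whose minimum distance to the already-selected set is largest. A sketch under the assumption of Euclidean distance (the distance measure and starting point of the original program are not shown in this log):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly select the vector
    whose minimum Euclidean distance to the selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    # Minimum distance from each vector to the selected set so far.
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        # Refresh each vector's distance to the (grown) selected set.
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[idx]))
    return selected

# Example: from five 1-D points, pick the three most spread out.
pts = [[0.0], [0.1], [0.5], [0.9], [1.0]]
chosen = farthest_first(pts, 3)
# chosen == [[0.0], [1.0], [0.5]]
```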

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
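The purity and entropy figures reported after each oracle round are consistent with the standard two-class definitions: purity is the majority-class fraction, entropy the Shannon entropy in bits. A sketch (assumed formulas; the original program's exact code is not shown here):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Two-class purity (majority fraction) and Shannon entropy (bits)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # by convention, 0 * log2(0) contributes nothing
            entropy -= q * math.log2(q)
    return purity, entropy

# Reproduces the figures for 24 matches / 63 non-matches above:
purity, entropy = purity_entropy(24, 63)
# purity ≈ 0.724, entropy ≈ 0.850
```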

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 823 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 0 matches and 823 non-matches
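The split step trains an SVM on the oracle-labelled samples and uses it to divide the remaining vectors in the cluster into predicted matches and non-matches. A minimal stand-in using scikit-learn's `SVC` (an assumption: the original program's SVM implementation and hyperparameters are not shown in this log):

```python
# Sketch of the SVM split step; the original program's SVM settings
# are unknown, so a linear-kernel SVC is assumed for illustration.
from sklearn.svm import SVC

def split_cluster(train_vectors, train_labels, remaining_vectors):
    """Train an SVM on oracle-labelled vectors and split the remaining
    vectors into predicted matches (label 1) and non-matches (label 0)."""
    clf = SVC(kernel="linear")
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vectors, preds) if p == 0]
    return matches, non_matches

# Toy usage: high-similarity vectors labelled 1 (match), low-similarity 0.
m, n = split_cluster([[0.9, 0.9], [0.1, 0.1]], [1, 0],
                     [[0.95, 0.85], [0.05, 0.2]])
# m == [[0.95, 0.85]], n == [[0.05, 0.2]]
```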

48.0
Analysing file: diverg(20)354_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 354), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)354_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)426_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 426), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)426_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1000
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1000 weight vectors
  Containing 198 true matches and 802 true non-matches
    (19.80% true matches)
  Identified 958 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   923  (96.35%)
          2 :    32  (3.34%)
          3 :     2  (0.21%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 958 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 782

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1000
  Number of unique weight vectors: 958

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (958, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 958 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 958 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 871 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 106 matches and 765 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (106, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (765, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 106 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 46

Farthest first selection of 46 weight vectors from 106 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [0.511, 1.000, 1.000, 1.000, 1.000, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 46 weight vectors
  The oracle will correctly classify 46 weight vectors and wrongly classify 0
  Classified 46 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 46 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(10)878_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 878), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)878_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 251
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 251 weight vectors
  Containing 207 true matches and 44 true non-matches
    (82.47% true matches)
  Identified 220 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   205  (93.18%)
          2 :    12  (5.45%)
          3 :     2  (0.91%)
         16 :     1  (0.45%)
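
The occurrence table above is a two-level count: first how often each distinct weight vector appears, then how many distinct vectors share each occurrence count. A minimal sketch (the function name is ours, not the script's):

```python
from collections import Counter

def vector_frequencies(weight_vectors):
    """Frequency distribution of duplicate weight vectors, as in the
    'Occurrence : Number of weight vectors that occur that often' table."""
    # how often each distinct vector occurs (tuples so they are hashable)
    counts = Counter(tuple(v) for v in weight_vectors)
    # how many distinct vectors share each occurrence count
    return Counter(counts.values())

# e.g. two copies of one vector plus two singletons:
# vector_frequencies([[1.0], [1.0], [2.0], [3.0]]) -> Counter({1: 2, 2: 1})
```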

Identified 1 non-pure unique weight vector (out of 220 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 176
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 43

Removed 1 non-pure weight vector
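
Reading the log, the removal step keeps, for each unique weight vector whose occurrences mix true matches and non-matches, only the majority-class copies (here: a vector occurring 16 times with pureness 15/16 = 0.938 loses its single non-match copy, 251 → 250). A sketch under that reading, with hypothetical names:

```python
from collections import defaultdict

def remove_minority_class(weight_vectors, labels):
    """For each unique weight vector, compute pureness = fraction of its
    occurrences that are true matches, then drop the minority-class copies
    of any vector with 0 < pureness < 1 (pure vectors are kept whole)."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)
    kept_vecs, kept_labels = [], []
    for vec, lab in zip(weight_vectors, labels):
        labs = groups[tuple(vec)]
        pureness = sum(labs) / len(labs)
        majority_is_match = pureness >= 0.5  # tie-break toward match (an assumption)
        if pureness in (0.0, 1.0) or (lab == 1) == majority_is_match:
            kept_vecs.append(vec)
            kept_labels.append(lab)
    return kept_vecs, kept_labels
```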

Final number of weight vectors to use: 250
  Number of unique weight vectors: 220

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (220, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 220 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 67

Perform initial selection using "far" method

Farthest first selection of 67 weight vectors from 220 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
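
The "farthest first" selection above can be sketched as a farthest-first traversal: starting from one vector, repeatedly add the vector whose minimum distance to the already-selected set is largest. The log does not show the script's distance metric or starting rule, so Euclidean distance and a fixed start index are assumptions here:

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Select k vectors by farthest-first traversal: repeatedly pick the
    vector whose minimum Euclidean distance to the selected set is largest.
    The original script may choose its starting vector differently."""
    X = np.asarray(vectors, dtype=float)
    selected = [start]
    # minimum distance from every vector to the selected set so far
    dists = np.linalg.norm(X - X[start], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(dists))
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(X - X[nxt], axis=1))
    return selected
```

This greedy rule spreads the sample across the whole cluster, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.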

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 42 matches and 25 non-matches
    Purity of oracle classification:  0.627
    Entropy of oracle classification: 0.953
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  25
    Number of false non-matches: 0
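
The purity and entropy printed after each oracle call follow the standard two-class definitions over the match proportion p: purity = max(p, 1 − p) and entropy = −p·log₂p − (1 − p)·log₂(1 − p). A sketch (the function name is ours, not the script's):

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity, entropy, and match proportion of a two-class cluster,
    matching the values printed after each oracle classification."""
    n = num_match + num_non_match
    p = num_match / n                 # estimated match proportion
    purity = max(p, 1.0 - p)
    if p in (0.0, 1.0):
        entropy = 0.0                 # a pure cluster has zero entropy
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy, p
```

For the 42/25 split above this gives purity 0.627 and entropy 0.953, the values the log prints.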

Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 153 weight vectors
  Based on 42 matches and 25 non-matches
  Classified 152 matches and 1 non-match

  Non-match cluster not large enough for required sample size
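
The SVM step trains on the oracle-labelled sample and splits the remaining cluster members by predicted class. A minimal sketch assuming scikit-learn; the original script's kernel and parameters are not shown in the log:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train an SVM on the oracle-labelled sample (labels 1 = match,
    0 = non-match) and split the rest of the cluster by prediction."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, preds) if p == 0]
    return matches, non_matches
```

Each resulting sub-cluster goes back onto the queue for further refinement, unless (as here) one side is smaller than the required sample size.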
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 1
  Number of manual oracle classifications performed: 67
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.6268656716417911, 0.9530483471581299, 0.6268656716417911)

Current size of match and non-match training data sets: 42 / 25

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 152 weight vectors
- Estimated match proportion 0.627

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 152 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 41 matches and 16 non-matches
    Purity of oracle classification:  0.719
    Entropy of oracle classification: 0.856
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  16
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analyzing file: diverg(20)854_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 854), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)854_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 946
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 946 weight vectors
  Containing 219 true matches and 727 true non-matches
    (23.15% true matches)
  Identified 891 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   855  (95.96%)
          2 :    33  (3.70%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (out of 891 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 706

Removed 1 non-pure weight vector

Final number of weight vectors to use: 945
  Number of unique weight vectors: 891

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (891, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 891 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 891 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 805 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 130 matches and 675 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (675, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 675 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 675 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 11 matches and 58 non-matches
    Purity of oracle classification:  0.841
    Entropy of oracle classification: 0.633
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(20)438_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 438), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)438_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (out of 916 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
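The purity and entropy figures reported for each oracle classification follow the standard definitions for a binary match/non-match split; a minimal sketch (the function names are illustrative, not the script's own):

```python
import math

def purity(num_match, num_non_match):
    """Fraction of the majority class in a binary cluster."""
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    """Shannon entropy (base 2) of the match/non-match split."""
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# Values from the oracle block above: 24 matches, 63 non-matches
print(f"{purity(24, 63):.3f}")   # 0.724
print(f"{entropy(24, 63):.3f}")  # 0.850
```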

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches
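The SVM split step logged above can be sketched as follows, using scikit-learn's `SVC` on hypothetical miniature 2-D data (the real weight vectors here have 7 similarity weights); this is an illustration of the split idea, not the script's actual classifier configuration:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical miniature training sample (1 = match, 0 = non-match)
train_X = np.array([[0.90, 0.90], [0.95, 0.80], [0.10, 0.20], [0.20, 0.10]])
train_y = np.array([1, 1, 0, 0])
remaining = np.array([[0.85, 0.95], [0.15, 0.05], [0.05, 0.25]])

# Train on the oracle-labelled sample, then split the rest of the
# cluster into a predicted-match and a predicted-non-match cluster.
clf = SVC(kernel="linear").fit(train_X, train_y)
pred = clf.predict(remaining)
match_cluster = remaining[pred == 1]
non_match_cluster = remaining[pred == 0]
print(len(match_cluster), len(non_match_cluster))  # 1 2
```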

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 706 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 706 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
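The farthest-first selection used above can be sketched as a greedy traversal; this minimal version starts from the first vector for determinism, whereas the script's implementation may choose its start point differently:

```python
import numpy as np

def farthest_first(vectors, k):
    """Greedy farthest-first traversal (a sketch): start from the first
    vector, then repeatedly add the vector whose minimum Euclidean
    distance to the already-selected set is largest."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [0]
    min_dist = np.linalg.norm(vectors - vectors[0], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(
            min_dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected

# Three 1-D points: the two extremes are picked before the middle one
print(farthest_first([[0.0], [1.0], [10.0]], 2))  # [0, 2]
```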

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
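The oracle with a configurable accuracy (the `oracle_acc` parameter from the usage notes) can be simulated as in this sketch; `oracle_classify` is a hypothetical helper, not the script's own function:

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Simulate a manual oracle: each true match status is reported
    correctly with probability `accuracy`, otherwise flipped."""
    rng = rng or random.Random(0)
    return [label if rng.random() < accuracy else not label
            for label in true_labels]

truth = [True, False, True, False]
print(oracle_classify(truth, 1.0))  # a perfect oracle reproduces the truth
```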

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)162_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 162), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)162_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analysing file: diverg(20)25_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 25), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)25_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 146 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (538, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 538 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 538 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 9 matches and 65 non-matches
    Purity of oracle classification:  0.878
    Entropy of oracle classification: 0.534
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)394_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 394), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)394_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as the percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
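
Farthest-first selection, as listed above, greedily picks the weight vector whose minimum Euclidean distance to the already-selected vectors is largest, so the sample spreads over the whole cluster. A pure-Python sketch (the seeding strategy is an assumption here — this version simply starts from the first vector, and the original program's choice is not visible in the log):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors via farthest-first traversal (the greedy
    2-approximation of k-center). Each pick maximises the minimum
    distance to all vectors picked so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # assumed seed: the first vector
    # minimum distance from every vector to the current selection
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], dist(v, vectors[i]))
    return selected
```

On one-dimensional toy data the spreading behaviour is easy to see: starting from 0.0, the next picks are the far end and then the midpoint.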

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
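
The purity and entropy figures reported for the oracle sample follow the standard definitions: purity is the fraction of the sample in the majority class, and entropy is the binary Shannon entropy of the match proportion. A quick check against the numbers above (24 matches, 63 non-matches):

```python
import math

def purity(num_matches, num_non_matches):
    """Fraction of the sample belonging to the majority class."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    """Binary Shannon entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(round(purity(24, 63), 3), round(entropy(24, 63), 3))  # prints: 0.724 0.85
```

These match the logged values of 0.724 and 0.850 for this 87-vector sample.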

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 0 matches and 829 non-matches
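
After the sampled vectors are deleted, the rest of the cluster is split by an SVM trained on the oracle-labelled sample, and each predicted class becomes a child cluster on the queue. A minimal sketch using scikit-learn (the kernel and parameters are assumptions — the original program's SVM settings are not shown in this log):

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, rest_vecs):
    """Train a binary SVM on the oracle-classified sample and split
    the remaining weight vectors into predicted-match and
    predicted-non-match child clusters."""
    clf = svm.SVC(kernel='linear')  # assumed kernel choice
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, pred) if p == 0]
    return matches, non_matches
```

In the run above the classifier put all 829 remaining vectors into the non-match cluster; with a more balanced training sample the split produces two non-empty children, as in the later loops.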

40.0
Analysing file: diverg(15)79_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 79), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)79_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 951
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 951 weight vectors
  Containing 218 true matches and 733 true non-matches
    (22.92% true matches)
  Identified 896 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   860  (95.98%)
          2 :    33  (3.68%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 896 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 712

Removed 1 non-pure weight vector

Final number of weight vectors to use: 950
  Number of unique weight vectors: 896

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (896, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 896 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 896 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
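
The oracle itself is simulated with a configurable accuracy (the oracle_acc command-line parameter): each sampled vector's true match status is returned correctly with that probability and flipped otherwise. At 100% accuracy, as in these runs, no labels are flipped. A sketch under the assumption of independent per-vector errors (the flip model is an assumption; the log only exercises the 100% case):

```python
import random

def oracle_classify(true_labels, oracle_acc, rng=None):
    """Simulate a human oracle with the given accuracy: each true
    label is kept with probability oracle_acc, flipped otherwise."""
    rng = rng or random.Random()
    return [lab if rng.random() < oracle_acc else not lab
            for lab in true_labels]
```

Since random() is drawn from [0, 1), an accuracy of 1.0 keeps every label and an accuracy of 0.0 flips every label, regardless of the random seed.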

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 810 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 156 matches and 654 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (654, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 156 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 156 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)749_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987952
recall                 0.274247
f-measure              0.429319
da                           83
dm                            0
ndm                           0
tp                           82
fp                            1
tn                  4.76529e+07
fn                          217
Name: (10, 1 - acm diverg, 749), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)749_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 600
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 600 weight vectors
  Containing 170 true matches and 430 true non-matches
    (28.33% true matches)
  Identified 580 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   569  (98.10%)
          2 :     8  (1.38%)
          3 :     2  (0.34%)
          9 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 580 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 152
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 427

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 591
  Number of unique weight vectors: 579

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (579, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 579 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 579 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 497 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 121 matches and 376 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (121, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (376, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 376 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 376 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.600, 0.700, 0.600, 0.611, 0.706] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.826, 0.286, 0.857, 0.643] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 4 matches and 65 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.319
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
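
The purity and entropy figures reported after each oracle call follow directly from the match/non-match counts: purity is the majority-class fraction, entropy the base-2 Shannon entropy of the split. A minimal sketch of that computation (the function name is assumed, not from the program):

```python
# Purity and entropy of a binary oracle classification:
# purity  = fraction of the majority class,
# entropy = Shannon entropy (base 2) of the match/non-match split.
import math

def purity_entropy(num_match, num_non_match):
    total = num_match + num_non_match
    p = num_match / total          # match proportion
    q = num_non_match / total      # non-match proportion
    purity = max(p, q)
    entropy = 0.0
    for x in (p, q):
        if x > 0.0:
            entropy -= x * math.log(x, 2)
    return purity, entropy

# 4 matches and 65 non-matches, as in the oracle output above:
purity, entropy = purity_entropy(4, 65)
print(round(purity, 3), round(entropy, 3))  # 0.942 0.319
```

The same formula reproduces the other oracle blocks in this log, e.g. 27 matches and 58 non-matches give purity 0.682 and entropy 0.902.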

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

83.0
Analyzing the file: diverg(20)791_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 791), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)791_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 854
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 854 weight vectors
  Containing 226 true matches and 628 true non-matches
    (26.46% true matches)
  Identified 797 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   760  (95.36%)
          2 :    34  (4.27%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)
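
The occurrence histogram above (how many weight vectors appear once, twice, and so on) can be reproduced by counting duplicates and then counting the counts. A small illustrative sketch with made-up vectors:

```python
# Occurrence histogram of weight vectors: first count how often each vector
# appears, then count how many unique vectors share each occurrence count.
# (Illustrative data; the real vectors come from the weight vector file.)
from collections import Counter

vectors = [
    (1.0, 0.0, 0.5), (1.0, 0.0, 0.5),   # occurs twice
    (0.7, 1.0, 0.2),                     # occurs once
    (0.3, 0.0, 0.9),                     # occurs once
]
occ_counts = Counter(map(tuple, vectors))   # vector -> number of occurrences
histogram = Counter(occ_counts.values())    # occurrences -> number of vectors
for occ in sorted(histogram):
    n = histogram[occ]
    print('%3d : %3d  (%.2f%%)' % (occ, n, 100.0 * n / len(occ_counts)))
```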

Identified 1 non-pure unique weight vector (from 797 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 607

Removed 1 non-pure weight vector

Final number of weight vectors to use: 853
  Number of unique weight vectors: 797

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (797, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 797 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 797 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
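
The "farthest first" selections in this log pick spread-out training examples greedily: repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch, seeding from the first vector and using Euclidean distance (both assumptions; the program's own `select_corners` logic may differ):

```python
# Greedy farthest-first selection: start from one vector, then repeatedly add
# the vector that maximises its minimum Euclidean distance to those selected.
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    selected = [vectors[0]]          # seed choice is an assumption
    candidates = list(vectors[1:])
    while candidates and len(selected) < k:
        best = max(candidates,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
        candidates.remove(best)
    return selected

corners = farthest_first([(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (1.0, 0.0)], 3)
```

On this toy input the selection jumps to the opposite corner (1.0, 1.0) before considering the nearby (0.1, 0.1), which is exactly the spreading behaviour the sampled vectors above exhibit.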

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 712 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 148 matches and 564 non-matches
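
After the oracle labels its sample, the remaining vectors of an impure cluster are split into two child clusters by a classifier trained on those labels (an SVM in this run). A dependency-free sketch of that split step, with a nearest-centroid rule standing in for the SVM and all names illustrative:

```python
# Split the unlabelled vectors into two child clusters using the
# oracle-labelled examples. A nearest-centroid rule stands in for the
# program's SVM classifier in this sketch.
import math

def centroid(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(matches, non_matches, unlabelled):
    m_cen, n_cen = centroid(matches), centroid(non_matches)
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    match_cluster, non_match_cluster = [], []
    for v in unlabelled:
        (match_cluster if dist(v, m_cen) < dist(v, n_cen)
         else non_match_cluster).append(v)
    return match_cluster, non_match_cluster

m, n = split_cluster([(0.9, 0.9)], [(0.1, 0.1)],
                     [(0.8, 0.7), (0.2, 0.0), (0.6, 0.6)])
```

Both child clusters are then pushed back onto the queue, which is why the queue length grows from 1 to 2 in the next loop.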

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (564, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 564 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 564 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 10 matches and 62 non-matches
    Purity of oracle classification:  0.861
    Entropy of oracle classification: 0.581
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing the file: diverg(20)803_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 803), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)803_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analyzing the file: diverg(10)754_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 754), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)754_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 902
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 902 weight vectors
  Containing 178 true matches and 724 true non-matches
    (19.73% true matches)
  Identified 863 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   833  (96.52%)
          2 :    27  (3.13%)
          3 :     2  (0.23%)
          9 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 863 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 159
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 703

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 893
  Number of unique weight vectors: 862

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (862, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 862 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 862 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 776 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 94 matches and 682 non-matches
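The SVM split step logged here — train on the oracle-labelled sample, then partition the remaining weight vectors into a predicted-match cluster and a predicted-non-match cluster — could be sketched as below. The use of scikit-learn and a linear kernel are assumptions for illustration, not necessarily what the original program does:

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_matches, train_non_matches, remaining):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining (unlabelled) vectors into predicted matches / non-matches."""
    X = np.vstack([train_matches, train_non_matches])
    y = np.array([1] * len(train_matches) + [0] * len(train_non_matches))
    clf = SVC(kernel="linear").fit(X, y)   # kernel choice is an assumption
    pred = clf.predict(np.asarray(remaining))
    matches = [v for v, p in zip(remaining, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining, pred) if p == 0]
    return matches, non_matches

# Toy illustration: high-similarity vs. low-similarity weight vectors
matches, non_matches = svm_split(
    [[0.9, 0.9], [0.8, 1.0]], [[0.1, 0.1], [0.2, 0.0]],
    [[0.95, 0.85], [0.05, 0.15]])
```

The two resulting lists become the two new clusters pushed onto the queue, which is why the queue length grows by one per split.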

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (682, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 682 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 682 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
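The "farthest first" selections listed above can be sketched as a greedy max-min traversal: repeatedly pick the vector farthest (in Euclidean distance) from everything selected so far. The seeding with the first vector and the choice of Euclidean distance are assumptions here:

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each time taking the one whose minimum
    distance to the already-selected set is largest."""
    selected = [vectors[0]]                      # assumed seed: first vector
    dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=dist.__getitem__)
        selected.append(vectors[i])
        # keep, for every vector, its distance to the nearest selected one
        dist = [min(d, math.dist(v, vectors[i])) for d, v in zip(dist, vectors)]
    return selected

pts = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (1.0, 0.0)]
print(farthest_first(pts, 3))
```

This spreads the sample across the cluster, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.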

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 14 matches and 55 non-matches
    Purity of oracle classification:  0.797
    Entropy of oracle classification: 0.728
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing the file: diverg(15)575_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 575), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)575_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1069
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1069 weight vectors
  Containing 221 true matches and 848 true non-matches
    (20.67% true matches)
  Identified 1013 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   977  (96.45%)
          2 :    33  (3.26%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
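The uniqueness analysis above boils down to counting duplicate weight vectors and then counting how often each multiplicity occurs — a two-level count, sketched here (the dict-of-counts representation is an assumption):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map multiplicity -> number of distinct weight vectors that
    occur exactly that many times in the input."""
    per_vector = Counter(map(tuple, vectors))   # how often each vector occurs
    return Counter(per_vector.values())         # distribution of those counts

# One vector occurring twice, one once, one three times
vecs = [[0.1, 0.2], [0.1, 0.2], [0.3, 0.4],
        [0.5, 0.6], [0.5, 0.6], [0.5, 0.6]]
print(occurrence_distribution(vecs))
```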

Identified 1 non-pure unique weight vector (from 1013 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1068
  Number of unique weight vectors: 1013

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1013, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1013 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1013 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 926 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 106 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (106, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 106 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 106 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 44 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(15)497_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 497), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)497_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 223 true matches and 585 true non-matches
    (27.60% true matches)
  Identified 754 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   717  (95.09%)
          2 :    34  (4.51%)
          3 :     2  (0.27%)
         17 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 754 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 564

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 754

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (754, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 754 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 754 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 669 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 93 matches and 576 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (93, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (576, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 576 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 576 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 20 matches and 53 non-matches
    Purity of oracle classification:  0.726
    Entropy of oracle classification: 0.847
    Number of true matches:      20
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0
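
The purity and entropy figures reported by the oracle step follow directly from the match/non-match counts. A minimal sketch (binary purity and two-class Shannon entropy; the function name is illustrative, but it reproduces the logged values for 20 matches and 53 non-matches):

```python
import math

def cluster_purity_entropy(num_matches, num_non_matches):
    """Purity and Shannon entropy of a cluster, given its class counts."""
    total = num_matches + num_non_matches
    purity = max(num_matches, num_non_matches) / total  # majority-class fraction
    entropy = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            entropy -= p * math.log(p, 2)
    return purity, entropy

purity, entropy = cluster_purity_entropy(20, 53)
print(round(purity, 3), round(entropy, 3))  # 0.726 0.847
```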

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
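
The overall control flow the log traces (select a cluster from the queue, sample it, let the oracle classify the sample, split the remainder, re-queue, stop when the manual classification budget is exhausted) can be sketched as a toy loop. All names and the naive split rule here are illustrative, not taken from the original program:

```python
import random

def budgeted_selection_loop(vectors, budget, sample_size=5, seed=1):
    """Toy version of the budgeted recursive selection loop.
    vectors: list of (weights_tuple, is_match) pairs."""
    rng = random.Random(seed)
    queue = [list(vectors)]
    used = 0
    train_matches, train_non_matches = [], []
    while queue and used < budget:
        cluster = queue.pop(rng.randrange(len(queue)))    # random queue ordering
        sample, cluster = cluster[:sample_size], cluster[sample_size:]
        used += len(sample)                               # oracle classifications used
        train_matches += [v for v in sample if v[1]]      # oracle = true labels here
        train_non_matches += [v for v in sample if not v[1]]
        if cluster:  # naive split on the first weight (stand-in for a classifier)
            mid = sum(v[0][0] for v in cluster) / len(cluster)
            for sub in ([v for v in cluster if v[0][0] < mid],
                        [v for v in cluster if v[0][0] >= mid]):
                if sub:
                    queue.append(sub)
    return used, train_matches, train_non_matches

used, tm, tn = budgeted_selection_loop(
    [((i / 20.0,), i % 3 == 0) for i in range(20)], budget=12)
print(used, len(tm) + len(tn))
```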

42.0
Analysing the file: diverg(10)994_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.197324
f-measure              0.329609
da                           59
dm                            0
ndm                           0
tp                           59
fp                            0
tn                  4.76529e+07
fn                          240
Name: (10, 1 - acm diverg, 994), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)994_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 693
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 693 weight vectors
  Containing 198 true matches and 495 true non-matches
    (28.57% true matches)
  Identified 648 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   614  (94.75%)
          2 :    31  (4.78%)
          3 :     2  (0.31%)
         11 :     1  (0.15%)
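
An occurrence distribution like the one above can be computed by counting each unique (hashable) weight vector and then counting the counts. A small sketch, not the original program's code:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Summarise how often each unique weight vector occurs:
    occurrence -> number of unique vectors occurring that often."""
    vector_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vector_counts.values())

vecs = [(0.1,), (0.1,), (0.2,), (0.2,), (0.3,), (0.3,), (0.3,)]
print(sorted(occurrence_distribution(vecs).items()))  # [(2, 2), (3, 1)]
```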

Identified 1 non-pure unique weight vector (from 648 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 173
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 474

Removed 1 non-pure weight vector

Final number of weight vectors to use: 692
  Number of unique weight vectors: 648
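
The non-pure filtering step described above (a unique weight vector generated by both matching and non-matching pairs keeps only its majority-class copies) can be sketched as follows. This is a plausible reconstruction from the log, not the original implementation:

```python
from collections import defaultdict

def remove_minority_class_vectors(weight_vectors):
    """weight_vectors: list of (weights_tuple, is_match). For every unique
    weight vector, compute its pureness (fraction of matches) and drop the
    minority-class copies."""
    groups = defaultdict(list)
    for weights, is_match in weight_vectors:
        groups[weights].append(is_match)
    kept = []
    for weights, labels in groups.items():
        majority_is_match = sum(labels) / len(labels) >= 0.5
        kept += [(weights, lab) for lab in labels if lab == majority_is_match]
    return kept

# 10 matching copies and 1 non-matching copy -> pureness 0.909, so the
# single non-match is removed (values here are illustrative).
vecs = [((0.9, 0.8), True)] * 10 + [((0.9, 0.8), False), ((0.1, 0.2), False)]
print(len(remove_minority_class_vectors(vecs)))  # 11
```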

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (648, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 648 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 648 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
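
The "far" initial selection above is a farthest-first traversal. A minimal sketch (Euclidean distance assumed; the original may use a random start point, here the first vector is used so the result is deterministic):

```python
def farthest_first_selection(vectors, k):
    """Greedy farthest-first traversal: start from the first vector, then
    repeatedly add the vector whose distance to its nearest already-selected
    vector is largest."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    remaining = list(vectors)
    selected = [remaining.pop(0)]
    while remaining and len(selected) < k:
        idx = max(range(len(remaining)),
                  key=lambda i: min(dist(remaining[i], s) for s in selected))
        selected.append(remaining.pop(idx))
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first_selection(pts, 3))
# [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```

Each greedy step maximises the minimum distance to the current selection, which is why the picked vectors spread across the weight space rather than clustering together.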

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 565 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 155 matches and 410 non-matches
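
The log does not show the SVM's kernel or parameters; a minimal scikit-learn sketch (linear kernel assumed, function name illustrative) of how the remaining cluster can be split by a classifier trained on the oracle-labelled sample:

```python
from sklearn import svm

def svm_split_cluster(train_vectors, train_labels, cluster_vectors):
    """Train an SVM on the oracle-classified vectors, then split the
    remaining cluster into predicted-match / predicted-non-match parts."""
    clf = svm.SVC(kernel='linear')  # kernel choice is an assumption
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(cluster_vectors)
    match_cluster = [v for v, p in zip(cluster_vectors, predictions) if p == 1]
    non_match_cluster = [v for v, p in zip(cluster_vectors, predictions) if p == 0]
    return match_cluster, non_match_cluster

train_X = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]]
train_y = [1, 1, 0, 0]
matches, non_matches = svm_split_cluster(
    train_X, train_y, [[0.95, 0.85], [0.05, 0.15]])
print(len(matches), len(non_matches))  # 1 1
```

The two sub-clusters then re-enter the queue, which is why the next loop shows a queue of length 2.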

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (410, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 155 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 155 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 44 matches and 11 non-matches
    Purity of oracle classification:  0.800
    Entropy of oracle classification: 0.722
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

59.0
Analysing the file: diverg(15)303_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 303), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)303_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1043
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1043 weight vectors
  Containing 222 true matches and 821 true non-matches
    (21.28% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   952  (96.26%)
          2 :    34  (3.44%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 800

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1042
  Number of unique weight vectors: 989

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 145 matches and 757 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (757, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 145 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(15)18_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 18), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)18_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 934
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 934 weight vectors
  Containing 200 true matches and 734 true non-matches
    (21.41% true matches)
  Identified 889 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   855  (96.18%)
          2 :    31  (3.49%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 889 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 175
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 713

Removed 1 non-pure weight vector

Final number of weight vectors to use: 933
  Number of unique weight vectors: 889

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (889, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 889 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 889 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
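
The purity and entropy figures reported by the oracle follow the standard binary-cluster definitions: purity is the fraction of the majority class, and entropy is the binary Shannon entropy of the match/non-match split. A minimal sketch that reproduces the figures above (the function name `purity_entropy` is illustrative, not taken from the script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity: fraction of the majority class in the classified set.
    # Entropy: binary Shannon entropy of the match/non-match split.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy
```

For the 29 matches and 57 non-matches above this gives purity ≈ 0.663 and entropy ≈ 0.922, matching the oracle output.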

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 803 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 155 matches and 648 non-matches
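
The SVM step trains a binary classifier on the oracle-labelled weight vectors and splits the remaining cluster by its predictions, which yields the two sub-clusters queued in the next loop. A hedged sketch using scikit-learn; the helper name `split_cluster` and the linear kernel are assumptions, as the script may use different SVM settings:

```python
from sklearn import svm

def split_cluster(match_train, non_match_train, cluster):
    # Train a binary SVM on the oracle-labelled weight vectors, then
    # split the remaining cluster by the classifier's predictions.
    X = match_train + non_match_train
    y = [1] * len(match_train) + [0] * len(non_match_train)
    clf = svm.SVC(kernel='linear')
    clf.fit(X, y)
    pred = clf.predict(cluster)
    matches = [v for v, p in zip(cluster, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster, pred) if p == 0]
    return matches, non_matches
```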

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (155, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (648, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 155 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 155 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
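
Farthest-first selection here presumably follows the classic greedy traversal: seed with one vector, then repeatedly add the vector whose minimum distance to the current selection is largest. A minimal sketch under that assumption (Euclidean distance and first-vector seeding are my choices; the script may use a different metric or seed):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector, then
    # repeatedly add the vector whose minimum Euclidean distance to the
    # already-selected set is largest.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```

This spreads the sample across the cluster, which is why the selected vectors above mix clearly match-like and clearly non-match-like patterns.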

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 46 matches and 9 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.643
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)197_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 197), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)197_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 479
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 479 weight vectors
  Containing 220 true matches and 259 true non-matches
    (45.93% true matches)
  Identified 443 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   424  (95.71%)
          2 :    16  (3.61%)
          3 :     2  (0.45%)
         17 :     1  (0.23%)

Identified 1 non-pure unique weight vector (from 443 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 256

Removed 1 non-pure weight vector

Final number of weight vectors to use: 478
  Number of unique weight vectors: 443
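
The non-pure handling groups identical weight vectors by value, computes each vector's pureness as the fraction of match labels among its occurrences, and (per the message above) removes the minority-class copies of any vector seen with both labels. A sketch under that reading; `remove_non_pure` is a hypothetical helper:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors):
    # weight_vectors: list of (vector_tuple, is_match) pairs.
    # Count match/non-match occurrences of each distinct vector, then
    # drop the minority-class copies of any vector seen with both labels.
    counts = defaultdict(lambda: [0, 0])
    for vec, is_match in weight_vectors:
        counts[vec][0 if is_match else 1] += 1
    kept = []
    for vec, is_match in weight_vectors:
        num_m, num_nm = counts[vec]
        if num_m > 0 and num_nm > 0:           # non-pure vector
            if is_match != (num_m >= num_nm):  # minority-class copy
                continue
        kept.append((vec, is_match))
    return kept
```

With 16 match and 1 non-match copies of the same vector, its pureness is 16/17 ≈ 0.941, matching the distribution above, and the single non-match copy is removed.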

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (443, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 443 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 443 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 37 matches and 42 non-matches
    Purity of oracle classification:  0.532
    Entropy of oracle classification: 0.997
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  42
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 364 weight vectors
  Based on 37 matches and 42 non-matches
  Classified 154 matches and 210 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (154, 0.5316455696202531, 0.9971085167216718, 0.46835443037974683)
    (210, 0.5316455696202531, 0.9971085167216718, 0.46835443037974683)

Current size of match and non-match training data sets: 37 / 42

Selected cluster (queue ordering: random):
- Purity 0.53 and entropy 1.00
- Size 154 weight vectors
- Estimated match proportion 0.468

Sample size for this cluster: 59

Farthest first selection of 59 weight vectors from 154 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 59 weight vectors
  The oracle will correctly classify 59 weight vectors and wrongly classify 0
  Classified 49 matches and 10 non-matches
    Purity of oracle classification:  0.831
    Entropy of oracle classification: 0.657
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  10
    Number of false non-matches: 0

Deleted 59 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)595_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (15, 1 - acm diverg, 595), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)595_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 904
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 904 weight vectors
  Containing 178 true matches and 726 true non-matches
    (19.69% true matches)
  Identified 865 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   835  (96.53%)
          2 :    27  (3.12%)
          3 :     2  (0.23%)
          9 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 865 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 159
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 705

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 895
  Number of unique weight vectors: 864

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (864, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 864 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 864 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 24 matches and 62 non-matches
    Purity of oracle classification:  0.721
    Entropy of oracle classification: 0.854
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 778 weight vectors
  Based on 24 matches and 62 non-matches
  Classified 94 matches and 684 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)
    (684, 0.7209302325581395, 0.8541802051521675, 0.27906976744186046)

Current size of match and non-match training data sets: 24 / 62

Selected cluster (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 94 weight vectors
- Estimated match proportion 0.279

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 94 vectors
  The selected farthest weight vectors are:
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 43 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing the file: diverg(20)776_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 776), dtype: object
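The precision, recall, and f-measure in the Series above follow directly from the confusion counts it also reports (tp = 42, fp = 0, fn = 257). A quick sanity check using the standard formulas:

```python
def prf(tp, fp, fn):
    """Standard precision / recall / F1 from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Counts reported for diverg(20)776_NEW.csv:
# reproduces precision 1, recall 0.140468, f-measure 0.246334
print(prf(42, 0, 257))
```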

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)776_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
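The "farthest first selection" listings above come from a greedy max-min strategy: each new pick maximises its minimum distance to the vectors already selected. The seed rule and distance metric used by recursive-train-selection.py are not visible in this log, so this sketch assumes Euclidean distance and seeding with the first vector:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly select the vector
    whose minimum Euclidean distance to the already-selected set is
    largest.  Seeding with vectors[0] is an assumption; the script's
    actual seed rule is not shown in this log."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```

On well-separated data the selection spreads across the space, which is why the listings above mix clear matches and clear non-matches rather than sampling one dense region.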

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
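The purity and entropy the oracle reports (0.701 and 0.880 for the 26/61 split above) can be reproduced from the match and non-match counts alone, assuming purity is the majority-class fraction and entropy the binary Shannon entropy of the match proportion:

```python
import math

def purity(num_match, num_nonmatch):
    # Majority-class fraction of the classified weight vectors
    return max(num_match, num_nonmatch) / (num_match + num_nonmatch)

def entropy(num_match, num_nonmatch):
    # Binary Shannon entropy (in bits) of the match proportion
    p = num_match / (num_match + num_nonmatch)
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)
```

With 26 matches and 61 non-matches this gives 0.701 and 0.880, matching the log; a pure oracle round (43 matches, 0 non-matches) gives 1.000 and 0.000.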

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches
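Each oracle round feeds its labelled samples to an SVM that then splits the remaining, unclassified vectors of the cluster, as in the "SVM classification of 911 weight vectors" step above. A minimal sketch using scikit-learn (an assumed dependency; the script's actual kernel and parameters are not shown in this log):

```python
from sklearn import svm  # assumed dependency; linear kernel is a guess

def split_cluster(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-classified weight vectors, then split
    the remaining cluster into predicted matches and non-matches."""
    clf = svm.SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the next loop shows a queue of length 2.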

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 793 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)866_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990385
recall                 0.344482
f-measure              0.511166
da                          104
dm                            0
ndm                           0
tp                          103
fp                            1
tn                  4.76529e+07
fn                          196
Name: (10, 1 - acm diverg, 866), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)866_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 861
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 861 weight vectors
  Containing 154 true matches and 707 true non-matches
    (17.89% true matches)
  Identified 825 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   797  (96.61%)
          2 :    25  (3.03%)
          3 :     2  (0.24%)
          8 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 825 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 138
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 686

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 853
  Number of unique weight vectors: 824

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (824, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 824 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 824 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 738 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 118 matches and 620 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (620, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 620 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 620 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 0 matches and 75 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

104.0
Analysing the file: diverg(15)669_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 669), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)669_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 754
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 754 weight vectors
  Containing 222 true matches and 532 true non-matches
    (29.44% true matches)
  Identified 718 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   699  (97.35%)
          2 :    16  (2.23%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 718 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 529

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 753
  Number of unique weight vectors: 718

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (718, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 718 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 718 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
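The "far" method above is a farthest-first (greedy k-centre) traversal: start from one vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. A minimal sketch, assuming Euclidean distance and an arbitrary starting vector (the actual program may seed the selection differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy k-centre selection: start from the first vector, then
    repeatedly add the vector whose distance to its nearest
    already-selected vector is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # starting point is an assumption
    # min_dist[i]: distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[i])
        min_dist = [min(d, dist(v, vectors[i]))
                    for d, v in zip(min_dist, vectors)]
    return selected
```

Each newly selected vector only tightens the per-vector nearest-selected distances, so the whole selection runs in O(k·n) distance computations.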

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
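The purity and entropy figures reported above are the majority-class fraction and the binary Shannon entropy (in bits) of the match proportion; a small sketch (the function name is hypothetical):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is the binary
    Shannon entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# For the 28 matches / 56 non-matches classified by the oracle above:
purity, entropy = purity_entropy(28, 56)  # ≈ (0.667, 0.918)
```

These match the values the program reports for this cluster (purity 0.667, entropy 0.918) and the queue tuples in the next loop.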

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 634 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 135 matches and 499 non-matches
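The split step trains a classifier on the oracle-labelled samples and partitions the remaining cluster by predicted class. A sketch of the SVM variant using scikit-learn's `SVC` (the linear kernel and function name are assumptions; the original program's SVM settings may differ):

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, cluster_vectors):
    """Fit an SVM on the oracle-classified weight vectors, then split
    the unclassified remainder of the cluster into a predicted-match
    and a predicted-non-match sub-cluster."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vectors, train_labels)
    predicted = clf.predict(cluster_vectors)
    match_cluster = [v for v, p in zip(cluster_vectors, predicted) if p]
    non_match_cluster = [v for v, p in zip(cluster_vectors, predicted) if not p]
    return match_cluster, non_match_cluster
```

Both sub-clusters are then pushed back onto the queue, which is why the next loop shows a queue of length 2 with sizes 135 and 499.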

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (499, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.92
- Size 499 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 499 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 13 matches and 60 non-matches
    Purity of oracle classification:  0.822
    Entropy of oracle classification: 0.676
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(10)530_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 530), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)530_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 928
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 928 weight vectors
  Containing 216 true matches and 712 true non-matches
    (23.28% true matches)
  Identified 873 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   837  (95.88%)
          2 :    33  (3.78%)
          3 :     2  (0.23%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vectors (from 873 unique weight vectors)
Pureness (as a proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 691

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 927
  Number of unique weight vectors: 873

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (873, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 873 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 873 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 25 matches and 61 non-matches
    Purity of oracle classification:  0.709
    Entropy of oracle classification: 0.870
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 787 weight vectors
  Based on 25 matches and 61 non-matches
  Classified 146 matches and 641 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)
    (641, 0.7093023255813954, 0.8696207740543749, 0.29069767441860467)

Current size of match and non-match training data sets: 25 / 61

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 146 weight vectors
- Estimated match proportion 0.291

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 146 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 50 matches and 2 non-matches
    Purity of oracle classification:  0.962
    Entropy of oracle classification: 0.235
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(15)493_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (15, 1 - acm diverg, 493), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)493_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 187 true matches and 865 true non-matches
    (17.78% true matches)
  Identified 1010 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   979  (96.93%)
          2 :    28  (2.77%)
          3 :     2  (0.20%)
         11 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1010 unique weight vectors)
Pureness (as a proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 165
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 844

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 1010

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1010, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1010 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1010 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
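
The purity and entropy values reported here are consistent with the majority-class fraction and the binary Shannon entropy of the oracle's match/non-match split; a minimal sketch assuming those definitions, reproducing the 0.724 / 0.850 figures for 24 matches and 63 non-matches:

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a classified cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    # Binary entropy in bits; the 0*log(0) boundary terms are taken as 0.
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = cluster_stats(24, 63)
print(round(purity, 3), round(entropy, 3))  # 0.724 0.85
```

The same two numbers reappear as the second and third entries of the queue tuples printed in each loop.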

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 923 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 91 matches and 832 non-matches
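
After the oracle step, the labelled vectors train a classifier that splits the remaining unlabelled vectors of the cluster into a predicted-match and a predicted-non-match sub-cluster, which then re-enter the queue. A hedged sketch of that SVM step with scikit-learn on toy data (the kernel and parameters of the actual program are assumptions):

```python
from sklearn.svm import SVC

# Toy stand-ins for the oracle-labelled training vectors (assumed data).
train_vectors = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
train_labels = [0, 0, 1, 1]          # 0 = non-match, 1 = match

# Remaining unlabelled weight vectors in the cluster.
remaining = [[0.15, 0.15], [0.85, 0.85]]

clf = SVC(kernel="linear")           # kernel choice is an assumption
clf.fit(train_vectors, train_labels)
predictions = clf.predict(remaining)

# Split the cluster by predicted class, giving the two sub-clusters
# ("Classified X matches and Y non-matches") seen in the log.
matches = [v for v, p in zip(remaining, predictions) if p == 1]
non_matches = [v for v, p in zip(remaining, predictions) if p == 0]
```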

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (91, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (832, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 91 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 91 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 42 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing the file: diverg(20)568_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 568), dtype: object
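
The precision, recall and f-measure in the row above follow directly from its tp/fp/fn counts (45, 1, 254); a quick check of those definitions:

```python
tp, fp, fn = 45, 1, 254   # counts from the row above

precision = tp / (tp + fp)   # 45 / 46
recall = tp / (tp + fn)      # 45 / 299
f_measure = 2 * precision * recall / (precision + recall)

print(round(precision, 6), round(recall, 6), round(f_measure, 5))
# 0.978261 0.150502 0.26087
```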

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)568_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)990_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 990), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)990_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 275
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 275 weight vectors
  Containing 199 true matches and 76 true non-matches
    (72.36% true matches)
  Identified 243 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :   230  (94.65%)
          2 :    10  (4.12%)
          3 :     2  (0.82%)
         19 :     1  (0.41%)

Identified 1 non-pure unique weight vector (from 243 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 167
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 75

Removed 1 non-pure weight vector

Final number of weight vectors to use: 274
  Number of unique weight vectors: 243

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (243, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 243 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 69

Perform initial selection using "far" method

Farthest first selection of 69 weight vectors from 243 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 35 matches and 34 non-matches
    Purity of oracle classification:  0.507
    Entropy of oracle classification: 1.000
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  34
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 174 weight vectors
  Based on 35 matches and 34 non-matches
  Classified 138 matches and 36 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 69
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.5072463768115942, 0.9998484829291058, 0.5072463768115942)
    (36, 0.5072463768115942, 0.9998484829291058, 0.5072463768115942)

Current size of match and non-match training data sets: 35 / 34

Selected cluster (queue ordering: random) with:
- Purity 0.51 and entropy 1.00
- Size 138 weight vectors
- Estimated match proportion 0.507

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 138 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 52 matches and 5 non-matches
    Purity of oracle classification:  0.912
    Entropy of oracle classification: 0.429
    Number of true matches:      52
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(15)110_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 110), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)110_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
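
Farthest-first selection, as listed above, greedily picks each next vector to maximise its Euclidean distance to the nearest already-selected vector. A sketch of the traversal (seeding with the first vector is an assumption, not necessarily the script's choice):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors."""
    selected = [vectors[0]]  # seed (assumption: start from the first vector)
    # dist[i] = distance from vectors[i] to its nearest selected vector
    dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=dist.__getitem__)
        selected.append(vectors[i])
        dist = [min(d, math.dist(v, vectors[i])) for d, v in zip(dist, vectors)]
    return selected

print(farthest_first([[0.0], [0.4], [0.5], [1.0]], 3))  # [[0.0], [1.0], [0.5]]
```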

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
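
The purity and entropy figures reported for each oracle classification correspond to the majority-class proportion and the binary Shannon entropy of the match/non-match split: 26 matches and 61 non-matches give purity 61/87 ≈ 0.701 and entropy ≈ 0.880. A sketch of that computation:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class proportion and binary Shannon entropy of a cluster."""
    p = num_matches / (num_matches + num_non_matches)
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = purity_entropy(26, 61)
print(round(purity, 3), round(entropy, 3))  # 0.701 0.88
```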

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches
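
Splitting the remaining cluster with a classifier trained on the oracle-labelled vectors (here: 26 matches and 61 non-matches used to partition 911 unlabelled vectors into two child clusters) might look like this with scikit-learn; the library, kernel, and toy data are assumptions, not necessarily what the script uses:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical oracle-labelled training data: weight vectors + match labels.
X_train = np.array([[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]])
y_train = np.array([1, 1, 0, 0])  # 1 = match, 0 = non-match

clf = SVC(kernel="linear")  # kernel choice is an assumption
clf.fit(X_train, y_train)

# Split the cluster's unlabelled vectors into match / non-match child clusters.
X_rest = np.array([[0.85, 0.85], [0.15, 0.15]])
print(clf.predict(X_rest))  # [1 0]
```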

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 793 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)891_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 891), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)891_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analysing file: diverg(15)662_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 662), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)662_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 908
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 908 weight vectors
  Containing 212 true matches and 696 true non-matches
    (23.35% true matches)
  Identified 856 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   820  (95.79%)
          2 :    33  (3.86%)
          3 :     2  (0.23%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 856 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 675

Removed 1 non-pure weight vector

Final number of weight vectors to use: 907
  Number of unique weight vectors: 856

Time to load and analyse the weight vector file: 0.05 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (856, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 856 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 856 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 26 matches and 60 non-matches
    Purity of oracle classification:  0.698
    Entropy of oracle classification: 0.884
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 770 weight vectors
  Based on 26 matches and 60 non-matches
  Classified 128 matches and 642 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (128, 0.6976744186046512, 0.8841151220488478, 0.3023255813953488)
    (642, 0.6976744186046512, 0.8841151220488478, 0.3023255813953488)

Current size of match and non-match training data sets: 26 / 60

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 642 weight vectors
- Estimated match proportion 0.302

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 642 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
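The farthest-first traversal shown above can be sketched roughly as follows. This is a minimal version under assumptions: the function name `farthest_first`, the Euclidean metric, and the random starting vector are guesses, since the script's actual implementation is not shown in this log.

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Select k vectors by farthest-first traversal: start from a random
    vector, then repeatedly pick the remaining vector whose minimum
    distance to the already-selected set is largest (assumed Euclidean)."""
    random.seed(seed)
    remaining = list(vectors)
    selected = [remaining.pop(random.randrange(len(remaining)))]
    while len(selected) < k and remaining:
        # Distance from a candidate to its nearest already-selected vector
        def min_dist(v):
            return min(math.dist(v, s) for s in selected)
        idx = max(range(len(remaining)), key=lambda i: min_dist(remaining[i]))
        selected.append(remaining.pop(idx))
    return selected
```

Farthest-first selection spreads the sample across the cluster, which is why the listings above mix clear matches, clear non-matches, and borderline vectors.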

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 13 matches and 59 non-matches
    Purity of oracle classification:  0.819
    Entropy of oracle classification: 0.681
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
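The purity and entropy figures reported after each oracle step (here 0.819 and 0.681 for 13 matches and 59 non-matches) are consistent with the standard two-class definitions: purity is the majority-class fraction, and entropy is the base-2 Shannon entropy of the match/non-match split. A minimal sketch (function names are illustrative, not taken from the script):

```python
import math

def purity(num_match, num_non_match):
    """Fraction of the majority class among the classified vectors."""
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    """Shannon entropy (base 2) of the match / non-match split."""
    total = num_match + num_non_match
    h = 0.0
    for n in (num_match, num_non_match):
        if n > 0:
            p = n / total
            h -= p * math.log2(p)
    return h
```

For example, `purity(13, 59)` gives 0.819 and `entropy(13, 59)` gives 0.681, matching the values above.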

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(20)255_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 255), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)255_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)
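The frequency distribution above (how many weight vectors occur once, twice, and so on) can be reproduced with two nested counts, sketched here with illustrative names:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each weight vector occurs, then count how many
    unique vectors share each occurrence frequency."""
    # First count copies of each (hashable) weight vector ...
    vector_counts = Counter(tuple(v) for v in weight_vectors)
    # ... then count how many unique vectors have each copy count.
    freq_dist = Counter(vector_counts.values())
    return dict(sorted(freq_dist.items()))
```

For instance, a list where one vector appears twice and two others appear once yields `{1: 2, 2: 1}`.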

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916
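The non-pure vector removal above (a unique weight vector with pureness 0.947, i.e. mostly true matches with one non-match copy, has its minority-class copies dropped) can be sketched as follows. The function name and the tie-handling at pureness 0.5 are assumptions; the log does not show the script's code.

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """Group copies of each unique weight vector, compute its pureness
    (fraction of copies labelled as matches), and drop the minority-class
    copies so every surviving unique vector has one consistent label."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    cleaned = []
    for vec, labels in groups.items():
        match_frac = sum(labels) / len(labels)
        majority = match_frac >= 0.5  # tie-breaking toward matches assumed
        # Keep only the copies that carry the majority label
        cleaned.extend((vec, majority) for lab in labels if lab == majority)
    return cleaned
```

A vector occurring 19 times with 18 match labels has pureness 18/19 ≈ 0.947; the single non-match copy is removed, mirroring the step reported above.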

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 706 non-matches
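The SVM split step above (train on the 24 + 63 oracle-labelled vectors, then divide the remaining 829 vectors into predicted matches and non-matches, producing the two queue clusters of the next loop) can be sketched as below. This assumes scikit-learn's `SVC`; the original script's SVM library and kernel settings are not visible in this log.

```python
# Minimal sketch of the cluster-splitting step, assuming scikit-learn.
from sklearn import svm

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on the oracle-classified vectors, then split the
    remaining vectors of the cluster into predicted matches and
    non-matches (the two child clusters pushed onto the queue)."""
    clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if not p]
    return matches, non_matches
```

Both child clusters inherit the parent's purity, entropy, and estimated match proportion until they are sampled themselves, which is why the two queue entries in Loop 2 share identical statistics.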

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (706, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 123 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 123 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 47 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(15)443_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 443), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)443_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 750
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 750 weight vectors
  Containing 222 true matches and 528 true non-matches
    (29.60% true matches)
  Identified 714 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   695  (97.34%)
          2 :    16  (2.24%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 714 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 525

Removed 1 non-pure weight vector

Final number of weight vectors to use: 749
  Number of unique weight vectors: 714

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (714, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 714 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 714 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 28 matches and 56 non-matches
    Purity of oracle classification:  0.667
    Entropy of oracle classification: 0.918
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 630 weight vectors
  Based on 28 matches and 56 non-matches
  Classified 133 matches and 497 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)
    (497, 0.6666666666666666, 0.9182958340544896, 0.3333333333333333)

Current size of match and non-match training data sets: 28 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.92
- Size 497 weight vectors
- Estimated match proportion 0.333

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 497 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 13 matches and 60 non-matches
    Purity of oracle classification:  0.822
    Entropy of oracle classification: 0.676
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(15)112_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 112), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)112_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1065
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1065 weight vectors
  Containing 209 true matches and 856 true non-matches
    (19.62% true matches)
  Identified 1018 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   983  (96.56%)
          2 :    32  (3.14%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1018 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 835

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1064
  Number of unique weight vectors: 1018

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1018, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1018 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87
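
The log does not show how this sample size is derived. One plausible reading, offered purely as a hypothesis, is Cochran's sample-size formula with a finite-population correction, seeded with the cluster's estimated match proportion; the function name and the z = 1.96 / error = 0.1 defaults are assumptions:

```python
def sample_size(cluster_size, est_match_prop, z=1.96, error=0.1):
    """Cochran's sample-size formula with finite-population correction:
    n0 = z^2 * p * (1 - p) / e^2,  n = n0 / (1 + (n0 - 1) / N)."""
    p = est_match_prop
    n0 = z * z * p * (1.0 - p) / (error * error)
    return int(n0 / (1.0 + (n0 - 1.0) / cluster_size))
```

For the 1018-vector cluster with estimated match proportion 0.5 this gives 87, matching the log; the agreement is suggestive, but it is not proof of the formula the program actually uses.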

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1018 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
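
Farthest-first selection, as used above, greedily picks each next vector to maximise its minimum distance to everything already selected. A minimal sketch (the function name, Euclidean distance, and starting from the first vector are assumptions; the actual program may seed differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors."""
    selected = [vectors[0]]
    # Distance from each candidate to its nearest selected vector so far
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        # Pick the candidate farthest from the selected set
        idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected
```

Because each pick maximises the distance to the current selection, the chosen vectors spread across the cluster, which is why the sample above mixes clear matches and clear non-matches.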

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
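
The purity and entropy figures reported for an oracle-classified sample follow the standard definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match/non-match split. A sketch consistent with the numbers above:

```python
import math

def purity(num_match, num_nonmatch):
    """Fraction of the majority class in the classified sample."""
    return max(num_match, num_nonmatch) / (num_match + num_nonmatch)

def entropy(num_match, num_nonmatch):
    """Binary Shannon entropy (in bits) of the match/non-match split."""
    total = num_match + num_nonmatch
    h = 0.0
    for count in (num_match, num_nonmatch):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h
```

With 27 matches and 60 non-matches this gives purity 60/87 ≈ 0.690 and entropy ≈ 0.894, as printed above.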

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 931 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 139 matches and 792 non-matches
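
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by predicted class. A minimal sketch using scikit-learn's `SVC` (the linear kernel, the helper name, and the 0/1 label encoding are assumptions; the program's actual SVM setup is not shown in this log):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled weight vectors, then split
    the remaining cluster into predicted matches and non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are pushed back onto the queue, which is why the queue length grows to 2 in the next loop.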

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (139, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (792, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 792 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 792 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.667, 0.500, 0.647, 0.556, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.750, 0.429, 0.526, 0.500, 0.846] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.233, 0.545, 0.714, 0.455, 0.238] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.462, 0.889, 0.455, 0.211, 0.375] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.583, 0.444, 0.412, 0.318, 0.421] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 9 matches and 65 non-matches
    Purity of oracle classification:  0.878
    Entropy of oracle classification: 0.534
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)119_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990099
recall                 0.334448
f-measure                   0.5
da                          101
dm                            0
ndm                           0
tp                          100
fp                            1
tn                  4.76529e+07
fn                          199
Name: (10, 1 - acm diverg, 119), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)119_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 748
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 748 weight vectors
  Containing 165 true matches and 583 true non-matches
    (22.06% true matches)
  Identified 709 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   680  (95.91%)
          2 :    26  (3.67%)
          3 :     2  (0.28%)
         10 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 709 unique weight vectors)
Pureness (as the fraction of matches) for a given unique weight vector:
  Pureness : Count
     1.000 : 146
     0.900 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 562

Removed 1 non-pure weight vector

Final number of weight vectors to use: 747
  Number of unique weight vectors: 709

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (709, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 709 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 709 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 625 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 120 matches and 505 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (120, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (505, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 120 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 120 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 40 matches and 11 non-matches
    Purity of oracle classification:  0.784
    Entropy of oracle classification: 0.752
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

101.0
Analysing the file: diverg(15)985_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 985), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)985_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 847
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 847 weight vectors
  Containing 220 true matches and 627 true non-matches
    (25.97% true matches)
  Identified 791 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   755  (95.45%)
          2 :    33  (4.17%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 791 unique weight vectors)
Pureness (as the fraction of matches) for a given unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 606

Removed 1 non-pure weight vector

Final number of weight vectors to use: 846
  Number of unique weight vectors: 791

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (791, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 791 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 791 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
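
The purity and entropy figures reported for each oracle classification (0.682 and 0.902 above) are the majority-class fraction and the binary entropy of the match/non-match split. A minimal sketch (the function name `purity_entropy` is mine, not from the original script):

```python
from math import log2

def purity_entropy(num_match, num_non_match):
    """Purity = majority-class fraction; entropy = binary entropy in bits."""
    total = num_match + num_non_match
    p = num_match / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = -sum(q * log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# The oracle result above: 27 matches, 58 non-matches
purity, entropy = purity_entropy(27, 58)
print(round(purity, 3), round(entropy, 3))  # 0.682 0.902
```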

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 706 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 142 matches and 564 non-matches
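
After the oracle labels a sample, the log shows the remainder of the cluster being split into two child clusters by an SVM trained on those labels. A rough sketch with scikit-learn (the random arrays here are stand-ins for the real weight vectors, not data from the run above):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Stand-in oracle-labelled training data: 27 "matches", 58 "non-matches"
train_X = np.vstack([rng.uniform(0.6, 1.0, (27, 7)),
                     rng.uniform(0.0, 0.5, (58, 7))])
train_y = np.array([1] * 27 + [0] * 58)

# Stand-in for the 706 unlabelled weight vectors left in the cluster
cluster_X = rng.uniform(0.0, 1.0, (706, 7))

# Train on the oracle labels, then split the cluster by predicted class
clf = SVC(kernel="linear").fit(train_X, train_y)
pred = clf.predict(cluster_X)
match_cluster = cluster_X[pred == 1]
non_match_cluster = cluster_X[pred == 0]
print(len(match_cluster), len(non_match_cluster))
```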

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (564, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 564 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 564 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
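
Farthest-first selection, as used above, greedily picks the vector whose distance to its nearest already-selected vector is largest. A small sketch in 2-D (the starting point and tie-breaking rule are assumptions; the original script may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector, then
    repeatedly add the vector maximising the distance to its nearest
    already-selected vector."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```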

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 10 matches and 62 non-matches
    Purity of oracle classification:  0.861
    Entropy of oracle classification: 0.581
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing file: diverg(20)707_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 707), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)707_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846
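
The pureness of a unique weight vector is the fraction of its occurrences that come from true matches; a vector occurring 20 times with 19 matching pairs has pureness 0.950, as in the table above. A minimal sketch (`pureness_per_unique` and the toy data are illustrative, not from the script):

```python
from collections import defaultdict

def pureness_per_unique(pairs):
    """pairs: (weight_vector_tuple, is_match) per record pair.
    Returns {vector: fraction of its occurrences that are true matches}."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [matches, total]
    for vec, is_match in pairs:
        counts[vec][0] += int(is_match)
        counts[vec][1] += 1
    return {vec: m / t for vec, (m, t) in counts.items()}

# Toy data: one vector seen 20 times (19 matches), one seen 3 times (0 matches)
data = [((0.9, 1.0), True)] * 19 + [((0.9, 1.0), False)] \
     + [((0.1, 0.0), False)] * 3
pure = pureness_per_unique(data)
print(pure)  # {(0.9, 1.0): 0.95, (0.1, 0.0): 0.0}
```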

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 27 matches and 61 non-matches
    Purity of oracle classification:  0.693
    Entropy of oracle classification: 0.889
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 27 matches and 61 non-matches
  Classified 148 matches and 800 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6931818181818182, 0.8894663896628687, 0.3068181818181818)
    (800, 0.6931818181818182, 0.8894663896628687, 0.3068181818181818)

Current size of match and non-match training data sets: 27 / 61

Selected cluster (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 800 weight vectors
- Estimated match proportion 0.307

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 800 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 9 matches and 65 non-matches
    Purity of oracle classification:  0.878
    Entropy of oracle classification: 0.534
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)347_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 347), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)347_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 928
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 928 weight vectors
  Containing 218 true matches and 710 true non-matches
    (23.49% true matches)
  Identified 873 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   837  (95.88%)
          2 :    33  (3.78%)
          3 :     2  (0.23%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 873 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 689

Removed 1 non-pure weight vector

Final number of weight vectors to use: 927
  Number of unique weight vectors: 873

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (873, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 873 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 873 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 787 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 166 matches and 621 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (166, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (621, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 166 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 166 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
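"Farthest first selection" above is the classic farthest-first traversal: seed the selection with one vector, then repeatedly add the remaining vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and a fixed first-vector seed (the original's seeding rule is not shown in this log):

```python
import math

def farthest_first(vectors, k):
    """Farthest-first traversal: greedily select k vectors that
    are maximally spread out under Euclidean distance."""
    selected = [tuple(vectors[0])]            # assumed seed: first vector
    remaining = [tuple(v) for v in vectors[1:]]
    while len(selected) < k and remaining:
        # pick the vector farthest from its nearest selected neighbour
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```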

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 45 matches and 11 non-matches
    Purity of oracle classification:  0.804
    Entropy of oracle classification: 0.715
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0
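The oracle simulates a human reviewer with a configurable accuracy; at 100.00% accuracy every queried label is returned correctly, while a lower setting would flip each label with probability 1 - accuracy. A sketch under that assumption (the function name and RNG seeding are this sketch's choices, not taken from the source):

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Simulated oracle: return each true match label correctly
    with probability `accuracy`, flipped otherwise."""
    rng = rng or random.Random(0)
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]
```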

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(15)854_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 854), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)854_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 777
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 777 weight vectors
  Containing 223 true matches and 554 true non-matches
    (28.70% true matches)
  Identified 738 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   719  (97.43%)
          2 :    16  (2.17%)
          3 :     2  (0.27%)
         20 :     1  (0.14%)
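The frequency distribution above can be reproduced with two nested counts: first count how often each unique weight vector occurs, then count how many unique vectors share each occurrence count.

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of unique weight vectors
    occurring that often (the table printed in the log)."""
    per_vector = Counter(map(tuple, weight_vectors))  # vector -> count
    return Counter(per_vector.values())               # count -> #vectors
```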

Identified 1 non-pure unique weight vector (from 738 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 186
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 551

Removed 1 non-pure weight vector
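The pureness clean-up removes duplicated weight vectors whose copies disagree on the true match status. The exact rule is not printed; from the two behaviours seen later in this log (pureness 0.950 drops only the minority-class copies, pureness 0.875 drops all copies), a plausible sketch uses a pureness threshold. The 0.9 cut-off below is a guess for illustration, not taken from the source:

```python
def clean_non_pure(copies, pureness, threshold=0.9):
    """copies: duplicates of one unique weight vector as
    (vector, is_match) pairs; pureness: fraction that are matches.
    Keeps pure vectors, drops minority-class copies of nearly-pure
    ones, and drops heavily mixed ones entirely (threshold assumed)."""
    if pureness in (0.0, 1.0):
        return copies                        # already pure: keep all
    if pureness >= threshold or pureness <= 1.0 - threshold:
        majority = pureness >= 0.5           # drop only the minority class
        return [c for c in copies if c[1] == majority]
    return []                                # too mixed: remove all copies
```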

Final number of weight vectors to use: 776
  Number of unique weight vectors: 738

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (738, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 738 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 738 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 36 matches and 49 non-matches
    Purity of oracle classification:  0.576
    Entropy of oracle classification: 0.983
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 653 weight vectors
  Based on 36 matches and 49 non-matches
  Classified 234 matches and 419 non-matches
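The log does not show how the SVM is configured, so as a dependency-free stand-in the split step can be illustrated with a nearest-centroid classifier: train on the 36 oracle-labelled matches and 49 non-matches, then assign each remaining vector to the nearer class centroid. The real run uses an SVM; `centroid_split` and its decision rule are this sketch's assumptions.

```python
import math

def centroid_split(train_vecs, train_labels, cluster_vecs):
    """Nearest-centroid stand-in for the SVM split: assign each
    unlabelled weight vector to the closer class centroid."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    m_cen = centroid([v for v, l in zip(train_vecs, train_labels) if l])
    n_cen = centroid([v for v, l in zip(train_vecs, train_labels) if not l])
    matches, non_matches = [], []
    for v in cluster_vecs:
        (matches if math.dist(v, m_cen) <= math.dist(v, n_cen)
         else non_matches).append(v)
    return matches, non_matches
```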

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (234, 0.5764705882352941, 0.9830605548016025, 0.4235294117647059)
    (419, 0.5764705882352941, 0.9830605548016025, 0.4235294117647059)

Current size of match and non-match training data sets: 36 / 49

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 419 weight vectors
- Estimated match proportion 0.424

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 419 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.462, 0.667, 0.600, 0.389, 0.615] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.500, 0.739, 0.824, 0.591, 0.550] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [0.850, 1.000, 0.179, 0.205, 0.188, 0.061, 0.180] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 0 matches and 76 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)168_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 168), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)168_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.300, 0.684, 0.833, 0.556, 0.433] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.429, 0.786, 0.750, 0.389, 0.857] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 147 matches and 537 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (537, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 537 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 537 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.556, 0.429, 0.500, 0.700, 0.643] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 7 matches and 68 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.447
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)451_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 451), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)451_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 907
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 907 weight vectors
  Containing 157 true matches and 750 true non-matches
    (17.31% true matches)
  Identified 871 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   843  (96.79%)
          2 :    25  (2.87%)
          3 :     2  (0.23%)
          8 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 871 unique weight vectors)
Pureness (as a fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 141
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 729

Removed 8 non-pure weight vectors
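
The removed vector has pureness 0.875, i.e. 7 of its 8 occurrences came from true matching pairs, so all 8 identical copies are discarded: identical weight vectors with mixed true match status cannot be labelled consistently. A minimal sketch of the pureness computation (function and variable names are illustrative, not from the script):

```python
from collections import defaultdict

def pureness_of_unique_vectors(weight_vectors, match_flags):
    """Pureness of each unique weight vector: the fraction of its
    occurrences generated by true matching record pairs.  Vectors with
    pureness strictly between 0 and 1 are ambiguous."""
    counts = defaultdict(lambda: [0, 0])  # vector -> [matches, occurrences]
    for vec, is_match in zip(weight_vectors, match_flags):
        entry = counts[tuple(vec)]
        entry[1] += 1
        if is_match:
            entry[0] += 1
    return {vec: m / n for vec, (m, n) in counts.items()}

# 8 copies of one vector, 7 labelled as matches -> pureness 7/8 = 0.875
vecs = [(0.9, 0.8)] * 8 + [(0.1, 0.2)] * 2
flags = [True] * 7 + [False] * 3
pure = pureness_of_unique_vectors(vecs, flags)
print(pure[(0.9, 0.8)], pure[(0.1, 0.2)])  # 0.875 0.0
```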

Final number of weight vectors to use: 899
  Number of unique weight vectors: 870

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (870, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 870 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 870 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
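
Farthest-first selection greedily grows the sample by always adding the weight vector whose minimum distance to the already selected vectors is largest, which spreads the sample across the whole cluster. A minimal sketch, assuming Euclidean distance and the first vector as the starting point (the script's actual start vector and metric are not shown in this log):

```python
import math

def farthest_first(vectors, k):
    """Farthest-first traversal: repeatedly select the vector whose
    minimum Euclidean distance to the vectors selected so far is
    largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # assumption: start from the first vector
    while len(selected) < min(k, len(vectors)):
        candidates = [v for v in vectors if v not in selected]
        selected.append(
            max(candidates, key=lambda v: min(dist(v, s) for s in selected))
        )
    return selected

points = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(points, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```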

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 23 matches and 63 non-matches
    Purity of oracle classification:  0.733
    Entropy of oracle classification: 0.838
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 784 weight vectors
  Based on 23 matches and 63 non-matches
  Classified 71 matches and 713 non-matches
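
The oracle-labelled vectors then train a classifier that splits the remaining 784 vectors into a predicted-match cluster (71) and a predicted-non-match cluster (713), both of which are pushed back onto the queue. A sketch using scikit-learn's `SVC` (the script's actual SVM implementation and parameters are not shown in this log):

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, cluster_vectors):
    """Train an SVM on the oracle-classified vectors and use it to split
    the remaining cluster into predicted matches and non-matches."""
    clf = SVC(kernel="linear")  # assumption: kernel not shown in the log
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, predictions) if p == 0]
    return matches, non_matches

# Toy example: one similarity weight, where matches have high similarity
train_X = [[0.1], [0.2], [0.8], [0.9]]
train_y = [0, 0, 1, 1]
matches, non_matches = svm_split(train_X, train_y, [[0.05], [0.95]])
print(len(matches), len(non_matches))  # 1 1
```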

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (71, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)
    (713, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)

Current size of match and non-match training data sets: 23 / 63

Selected cluster with (queue ordering: random):
- Purity 0.73 and entropy 0.84
- Size 713 weight vectors
- Estimated match proportion 0.267

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 713 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing file: diverg(15)399_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (15, 1 - acm diverg, 399), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)399_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 908
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 908 weight vectors
  Containing 157 true matches and 751 true non-matches
    (17.29% true matches)
  Identified 872 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   844  (96.79%)
          2 :    25  (2.87%)
          3 :     2  (0.23%)
          8 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 872 unique weight vectors)
Pureness (as a fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 141
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 730

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 900
  Number of unique weight vectors: 871

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (871, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 871 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 871 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 23 matches and 63 non-matches
    Purity of oracle classification:  0.733
    Entropy of oracle classification: 0.838
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 785 weight vectors
  Based on 23 matches and 63 non-matches
  Classified 71 matches and 714 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (71, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)
    (714, 0.7325581395348837, 0.837769869006679, 0.26744186046511625)

Current size of match and non-match training data sets: 23 / 63

Selected cluster with (queue ordering: random):
- Purity 0.73 and entropy 0.84
- Size 714 weight vectors
- Estimated match proportion 0.267

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 714 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing file: diverg(15)962_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 962), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)962_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 480
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 480 weight vectors
  Containing 212 true matches and 268 true non-matches
    (44.17% true matches)
  Identified 446 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   429  (96.19%)
          2 :    14  (3.14%)
          3 :     2  (0.45%)
         17 :     1  (0.22%)

Identified 1 non-pure unique weight vector (from 446 unique weight vectors)
Pureness (as a fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 265

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 479
  Number of unique weight vectors: 446

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (446, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 446 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 446 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 35 matches and 44 non-matches
    Purity of oracle classification:  0.557
    Entropy of oracle classification: 0.991
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0
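
The purity, entropy, and estimated match proportion reported throughout the log follow the standard binary definitions: the match proportion is the fraction of oracle-classified matches in the sample, purity is the majority-class fraction, and entropy is the binary Shannon entropy. A minimal sketch (the function name is an assumption, not the script's real interface):

```python
import math

def sample_statistics(num_matches, num_non_matches):
    # Purity: fraction of the majority class in the classified sample.
    # Entropy: binary Shannon entropy of the match proportion.
    total = num_matches + num_non_matches
    p = num_matches / total                 # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy, p
```

For the sample above, `sample_statistics(35, 44)` gives purity 0.557, entropy 0.991 and match proportion 0.443, matching the Loop 2 queue entries below.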

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 367 weight vectors
  Based on 35 matches and 44 non-matches
  Classified 140 matches and 227 non-matches
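
The split step trains a classifier on all oracle-labelled weight vectors gathered so far and partitions the remaining unlabelled cluster by predicted class. The kernel and parameters are not shown in the log; this sketch assumes scikit-learn's `SVC` with default settings as a stand-in:

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, cluster_vectors):
    # Train an SVM on the oracle-classified sample (labels: 1 = match,
    # 0 = non-match), then split the unlabelled remainder of the cluster
    # into a predicted-match and a predicted-non-match child cluster.
    clf = SVC()  # default kernel/parameters are an assumption
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, predictions) if p == 0]
    return matches, non_matches
```

The two child clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.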

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.5569620253164557, 0.9906174973781801, 0.4430379746835443)
    (227, 0.5569620253164557, 0.9906174973781801, 0.4430379746835443)

Current size of match and non-match training data sets: 35 / 44

Selected cluster with (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 227 weight vectors
- Estimated match proportion 0.443
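
The overall loop visible in the log (pop a cluster from the queue in random order, sample it, have the oracle classify the sample, and split the remainder if the cluster is still impure or too large) can be sketched as below; `sample_fn`, `oracle_fn` and `split_fn` are hypothetical stand-ins for the script's real components, and the thresholds are assumed defaults:

```python
import random

def selection_loop(initial_cluster, budget, sample_fn, oracle_fn, split_fn,
                   min_purity=0.95, max_cluster_size=100):
    # Recursive training example selection: grow match / non-match training
    # sets from oracle-classified samples until the budget is exhausted.
    queue = [initial_cluster]
    matches, non_matches = [], []
    used = 0
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random ordering
        sample = sample_fn(cluster)
        labels = oracle_fn(sample)          # manual (oracle) classification
        used += len(sample)
        matches += [v for v, l in zip(sample, labels) if l]
        non_matches += [v for v, l in zip(sample, labels) if not l]
        remainder = [v for v in cluster if v not in sample]
        p = sum(labels) / len(labels)       # sample match proportion
        purity = max(p, 1.0 - p)
        if remainder and (purity < min_purity
                          or len(remainder) > max_cluster_size):
            queue.extend(split_fn(remainder, matches, non_matches))
    return matches, non_matches
```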

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 227 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
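
Farthest-first selection greedily picks, at each step, the weight vector whose minimum distance to the vectors already selected is largest, so the sample spreads across the whole cluster. A minimal sketch (the seeding and the Euclidean metric are assumptions; the script's actual choices are not shown in the log):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector, then
    # repeatedly add the vector that maximises the minimum distance to
    # the already selected set.
    vectors = [tuple(v) for v in vectors]
    selected = [vectors[0]]
    candidates = set(vectors[1:])
    while len(selected) < k and candidates:
        best = max(candidates,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        candidates.discard(best)
    return [list(v) for v in selected]
```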

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 6 matches and 61 non-matches
    Purity of oracle classification:  0.910
    Entropy of oracle classification: 0.435
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analyzing the file: diverg(15)513_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (15, 1 - acm diverg, 513), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)513_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 907
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 907 weight vectors
  Containing 204 true matches and 703 true non-matches
    (22.49% true matches)
  Identified 858 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   824  (96.04%)
          2 :    31  (3.61%)
          3 :     2  (0.23%)
         15 :     1  (0.12%)
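
The frequency distribution above counts how many identical copies of each weight vector occur, then tabulates how many unique vectors share each occurrence count. A sketch of that tabulation, assuming vectors compare equal component-wise:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First count the occurrences of each unique weight vector, then
    # count how many unique vectors share each occurrence count.
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```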

Identified 1 non-pure unique weight vector (from 858 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 682

Removed 1 non-pure weight vector

Final number of weight vectors to use: 906
  Number of unique weight vectors: 858
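
A non-pure unique weight vector is one generated by both matching and non-matching record pairs; its minority-class copies are removed so every remaining unique vector carries a single true label (the 0.933 pureness above corresponds to 14 matches out of 15 copies, of which the 1 non-match is dropped). A sketch under that reading (function name is hypothetical):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    # Group identical weight vectors; for each group compute the pureness
    # (fraction of true matches among its copies) and drop the copies
    # carrying the minority label. Fully pure groups are kept unchanged.
    groups = defaultdict(list)
    for vec, label in zip(weight_vectors, labels):
        groups[tuple(vec)].append(label)
    kept_vectors, kept_labels = [], []
    for vec, labs in groups.items():
        pureness = sum(labs) / len(labs)
        majority = pureness >= 0.5      # tie-break in favour of matches
        for label in labs:
            if label == majority:
                kept_vectors.append(list(vec))
                kept_labels.append(label)
    return kept_vectors, kept_labels
```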

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (858, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 858 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 858 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 772 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 149 matches and 623 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (623, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 623 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 623 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 2 matches and 72 non-matches
    Purity of oracle classification:  0.973
    Entropy of oracle classification: 0.179
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analyzing the file: diverg(20)452_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 452), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)452_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1075
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1075 weight vectors
  Containing 227 true matches and 848 true non-matches
    (21.12% true matches)
  Identified 1018 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   981  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1018 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1074
  Number of unique weight vectors: 1018

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1018, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1018 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1018 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 931 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 819 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (819, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
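
Farthest-first selection (a farthest-first traversal) starts from one vector and repeatedly adds the vector whose minimum distance to the already-selected set is largest, spreading the sample over the whole cluster. A minimal sketch, assuming Euclidean distance and an arbitrary starting vector (the original script's metric and seed choice may differ):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal (Euclidean distance)."""
    selected = [vectors[0]]                          # arbitrary starting vector
    # minimum distance from each candidate to the selected set so far
    min_d = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):              # tighten the minima
            min_d[j] = min(min_d[j], math.dist(v, vectors[i]))
    return selected
```

Because a selected vector's distance to the set drops to zero, it is never picked twice; the sample therefore covers the extremes of the cluster first, which is visible in the mix of near-0 and near-1 weight vectors chosen above.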

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
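
The control flow visible in these logs — pop a cluster from the queue, sample it, label the sample via the oracle, and split impure or oversized clusters with the trained classifier until the labelling budget is spent — can be sketched as a skeleton loop. Cluster representation, thresholds, and the `sample`/`oracle`/`split` callbacks are illustrative stand-ins, not the original implementation:

```python
from collections import deque

def recursive_selection(root_cluster, budget, sample, oracle, split,
                        min_purity=0.95, max_cluster_size=100):
    """Skeleton of the recursive training-example selection loop."""
    queue = deque([root_cluster])
    labelled = []                                 # (weight_vector, label) pairs
    used = 0                                      # oracle classifications so far
    while queue and used < budget:
        cluster = queue.popleft()                 # queue ordering: FIFO here
        chosen = sample(cluster)                  # e.g. farthest-first sample
        if not chosen:
            continue
        labels = [oracle(v) for v in chosen]      # manual classification
        used += len(chosen)
        labelled.extend(zip(chosen, labels))
        rest = [v for v in cluster if v not in chosen]
        purity = max(sum(labels), len(labels) - sum(labels)) / len(labels)
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            queue.extend(split(labelled, rest))   # e.g. SVM two-way split
    return labelled
```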

39.0
Analysing file: diverg(10)327_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 327), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)327_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 461
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 461 weight vectors
  Containing 197 true matches and 264 true non-matches
    (42.73% true matches)
  Identified 437 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   420  (96.11%)
          2 :    14  (3.20%)
          3 :     2  (0.46%)
          7 :     1  (0.23%)
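
The uniqueness analysis above counts how often each distinct weight vector occurs and then tabulates that frequency distribution. A short sketch with `collections.Counter` (the actual loading code is not part of this excerpt):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count distinct weight vectors, then count how many distinct
    vectors occur once, twice, three times, and so on."""
    vec_counts = Counter(map(tuple, weight_vectors))   # vector -> occurrences
    freq_dist = Counter(vec_counts.values())           # occurrences -> #vectors
    return len(vec_counts), dict(sorted(freq_dist.items()))

vecs = [[0.1, 0.2], [0.1, 0.2], [0.3, 0.4],
        [0.5, 0.6], [0.5, 0.6], [0.5, 0.6]]
print(occurrence_distribution(vecs))  # (3, {1: 1, 2: 1, 3: 1})
```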

Identified 0 non-pure unique weight vectors (from 437 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.000 : 262
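
Pureness here is the fraction of true matches among all record pairs sharing the same weight vector; for a non-pure vector the minority-class copies are removed before training-example selection (elsewhere in this log a vector with pureness 0.917 loses its single non-match copy). A sketch of that filtering with labels coded 1 for match and 0 for non-match; the function name is illustrative:

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels):
    """Group pairs by identical weight vector and drop minority-class
    copies of vectors that carry mixed (non-pure) labels."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)
    kept = []
    for vec, labs in groups.items():
        pureness = sum(labs) / len(labs)          # match fraction
        if pureness in (0.0, 1.0):                # pure: keep all copies
            kept += [(list(vec), lab) for lab in labs]
        else:                                     # mixed: keep majority class only
            majority = 1 if pureness >= 0.5 else 0
            kept += [(list(vec), lab) for lab in labs if lab == majority]
    return kept
```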

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 461
  Number of unique weight vectors: 437

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (437, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 437 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 437 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 34 matches and 45 non-matches
    Purity of oracle classification:  0.570
    Entropy of oracle classification: 0.986
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 358 weight vectors
  Based on 34 matches and 45 non-matches
  Classified 136 matches and 222 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.569620253164557, 0.9859690274511927, 0.43037974683544306)
    (222, 0.569620253164557, 0.9859690274511927, 0.43037974683544306)

Current size of match and non-match training data sets: 34 / 45

Selected cluster with (queue ordering: random):
- Purity 0.57 and entropy 0.99
- Size 222 weight vectors
- Estimated match proportion 0.430

Sample size for this cluster: 66

Farthest first selection of 66 weight vectors from 222 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 66 weight vectors
  The oracle will correctly classify 66 weight vectors and wrongly classify 0
  Classified 4 matches and 62 non-matches
    Purity of oracle classification:  0.939
    Entropy of oracle classification: 0.330
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 66 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(10)610_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 610), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)610_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 510
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 510 weight vectors
  Containing 207 true matches and 303 true non-matches
    (40.59% true matches)
  Identified 481 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   464  (96.47%)
          2 :    14  (2.91%)
          3 :     2  (0.42%)
         12 :     1  (0.21%)

Identified 1 non-pure unique weight vectors (from 481 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 300

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 509
  Number of unique weight vectors: 481

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (481, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 481 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 481 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 37 matches and 43 non-matches
    Purity of oracle classification:  0.537
    Entropy of oracle classification: 0.996
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 401 weight vectors
  Based on 37 matches and 43 non-matches
  Classified 300 matches and 101 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (300, 0.5375, 0.9959386076315955, 0.4625)
    (101, 0.5375, 0.9959386076315955, 0.4625)

Current size of match and non-match training data sets: 37 / 43

Selected cluster with (queue ordering: random):
- Purity 0.54 and entropy 1.00
- Size 101 weight vectors
- Estimated match proportion 0.463

Sample size for this cluster: 49

Farthest first selection of 49 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 49 weight vectors
  The oracle will correctly classify 49 weight vectors and wrongly classify 0
  Classified 1 match and 48 non-matches
    Purity of oracle classification:  0.980
    Entropy of oracle classification: 0.144
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 49 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)368_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (10, 1 - acm diverg, 368), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)368_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 620
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 620 weight vectors
  Containing 160 true matches and 460 true non-matches
    (25.81% true matches)
  Identified 586 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   556  (94.88%)
          2 :    27  (4.61%)
          3 :     2  (0.34%)
          4 :     1  (0.17%)

Identified 0 non-pure unique weight vectors (from 586 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 146
     0.000 : 440

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 620
  Number of unique weight vectors: 586

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (586, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 586 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 586 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

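The farthest-first listing above can be sketched as a simple greedy loop: repeatedly pick the weight vector whose distance to its nearest already-selected vector is largest. This is a minimal sketch assuming Euclidean distance and an arbitrary seed vector; the program's actual distance measure and initialisation are not shown in the log.

```python
# Minimal farthest-first selection sketch (assumes Euclidean distance
# and an arbitrary seed; the script's real initialisation may differ).
import math

def farthest_first(vectors, k):
    """Greedily pick k vectors, each maximising the distance to the
    nearest already-selected vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]                      # arbitrary seed
    while len(selected) < k:
        # For every remaining candidate, use its distance to the
        # closest selected vector; take the candidate maximising it.
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```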
Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 25 matches and 57 non-matches
    Purity of oracle classification:  0.695
    Entropy of oracle classification: 0.887
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

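The purity and entropy figures reported for each oracle round follow from the match / non-match counts: purity is the majority-class fraction, entropy the binary Shannon entropy of the class split. This is a check that reproduces the reported values, not the program's own code.

```python
# Purity = majority-class fraction; entropy = binary Shannon entropy
# of the match / non-match split (base-2 logarithm).
import math

def purity(n_match, n_nonmatch):
    total = n_match + n_nonmatch
    return max(n_match, n_nonmatch) / total

def entropy(n_match, n_nonmatch):
    total = n_match + n_nonmatch
    h = 0.0
    for n in (n_match, n_nonmatch):
        p = n / total
        if p > 0:
            h -= p * math.log2(p)
    return h

# 25 matches and 57 non-matches, as in the first oracle round:
print(round(purity(25, 57), 3))   # 0.695
print(round(entropy(25, 57), 3))  # 0.887
```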
Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 504 weight vectors
  Based on 25 matches and 57 non-matches
  Classified 97 matches and 407 non-matches

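The splitting step trains a classifier on the oracle-labelled vectors and partitions the remaining vectors by predicted class. A hedged sketch using scikit-learn's SVC follows; the kernel and other SVM settings are assumptions, as the log does not show them.

```python
# Sketch of the cluster-splitting step: fit an SVM on the
# oracle-labelled weight vectors, then split the remaining vectors by
# predicted class (kernel choice is an assumption).
from sklearn.svm import SVC

def split_cluster(train_vecs, train_labels, remaining_vecs):
    clf = SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, preds) if p == 0]
    return matches, non_matches
```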
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (97, 0.6951219512195121, 0.8871723027673717, 0.3048780487804878)
    (407, 0.6951219512195121, 0.8871723027673717, 0.3048780487804878)

Current size of match and non-match training data sets: 25 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.89
- Size 407 weight vectors
- Estimated match proportion 0.305

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 407 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

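The "will correctly classify X and wrongly classify Y" lines suggest the simulated oracle fixes an exact number of errors rather than flipping each answer independently. A hypothetical sketch of such an oracle, under that assumption:

```python
# Hypothetical simulated oracle: returns the true match labels with an
# exact number of errors derived from the accuracy parameter (the
# program's own error-injection details are not shown in the log).
import random

def oracle(true_labels, accuracy, rng=None):
    rng = rng or random.Random(0)
    n = len(true_labels)
    n_wrong = int(round((1.0 - accuracy) * n))
    # Flip the label (0 <-> 1) for the chosen "wrong" positions only.
    wrong_idx = set(rng.sample(range(n), n_wrong))
    return [1 - lab if i in wrong_idx else lab
            for i, lab in enumerate(true_labels)]
```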
Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(15)888_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 888), dtype: object

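The precision, recall and f-measure fields printed in the Series above follow the usual definitions from the tp, fp and fn counts. A quick check (not the evaluation script's own code):

```python
# Standard precision / recall / F1 from true positives, false
# positives and false negatives, with guards against empty denominators.
def prf(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# tp=58, fp=0, fn=241 as in the Series printed above:
p, r, f = prf(tp=58, fp=0, fn=241)
print(round(p, 5), round(r, 5), round(f, 5))  # 1.0 0.19398 0.32493
```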
Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)888_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 624
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 624 weight vectors
  Containing 195 true matches and 429 true non-matches
    (31.25% true matches)
  Identified 598 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   583  (97.49%)
          2 :    12  (2.01%)
          3 :     2  (0.33%)
         11 :     1  (0.17%)

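The occurrence frequency distribution above can be reproduced with two nested Counters: first count each unique weight vector, then count how many vectors share each occurrence count. A small sketch with made-up vectors:

```python
# Count each unique weight vector, then count how often each
# occurrence count appears (the distribution printed in the log).
from collections import Counter

def occurrence_distribution(weight_vectors):
    vec_counts = Counter(map(tuple, weight_vectors))
    return Counter(vec_counts.values())

vecs = [[0.1, 0.2]] * 3 + [[0.3, 0.4]] * 2 + [[0.5, 0.6]]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```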
Identified 1 non-pure unique weight vectors (from 598 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 171
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 426

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 623
  Number of unique weight vectors: 598

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (598, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 598 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 598 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 32 matches and 51 non-matches
    Purity of oracle classification:  0.614
    Entropy of oracle classification: 0.962
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 515 weight vectors
  Based on 32 matches and 51 non-matches
  Classified 143 matches and 372 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (143, 0.6144578313253012, 0.9618624139909456, 0.3855421686746988)
    (372, 0.6144578313253012, 0.9618624139909456, 0.3855421686746988)

Current size of match and non-match training data sets: 32 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.96
- Size 143 weight vectors
- Estimated match proportion 0.386

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 143 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 50 matches and 6 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.491
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(10)453_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (10, 1 - acm diverg, 453), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)453_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 386
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 386 weight vectors
  Containing 194 true matches and 192 true non-matches
    (50.26% true matches)
  Identified 365 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   351  (96.16%)
          2 :    11  (3.01%)
          3 :     2  (0.55%)
          7 :     1  (0.27%)

Identified 0 non-pure unique weight vectors (from 365 unique weight vectors)
Pureness (as percentage of matches) per unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 192

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 386
  Number of unique weight vectors: 365

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (365, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 365 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 365 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 38 matches and 38 non-matches
    Purity of oracle classification:  0.500
    Entropy of oracle classification: 1.000
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 289 weight vectors
  Based on 38 matches and 38 non-matches
  Classified 133 matches and 156 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (133, 0.5, 1.0, 0.5)
    (156, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 38 / 38

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 156 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 60

Farthest first selection of 60 weight vectors from 156 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.850, 1.000, 0.179, 0.205, 0.188, 0.061, 0.180] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.902, 1.000, 0.182, 0.071, 0.182, 0.222, 0.190] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.512, 1.000, 0.087, 0.190, 0.107, 0.226, 0.204] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.947, 1.000, 0.292, 0.178, 0.227, 0.122, 0.154] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.663, 1.000, 0.273, 0.244, 0.226, 0.196, 0.238] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 1.000, 0.224, 0.219, 0.140, 0.209, 0.161] (False)
    [0.663, 1.000, 0.132, 0.143, 0.241, 0.174, 0.167] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.747, 1.000, 0.231, 0.167, 0.107, 0.222, 0.125] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
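
The "farthest first" sample above is the classic farthest-first traversal: starting from a seed vector, the algorithm repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the whole cluster. A minimal sketch (the program's seeding rule is not visible in the log; here the first vector is used as the seed):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first(vectors, k):
    """Greedily select the indices of k mutually distant vectors."""
    selected = [0]  # seed with the first vector (assumed seeding rule)
    while len(selected) < k:
        best_i, best_d = None, -1.0
        for i, v in enumerate(vectors):
            if i in selected:
                continue
            # Distance to the selected set = distance to the closest member.
            d = min(euclidean(v, vectors[j]) for j in selected)
            if d > best_d:
                best_i, best_d = i, d
        selected.append(best_i)
    return selected

vecs = [[0.0], [1.0], [10.0], [5.0]]
print(farthest_first(vecs, 3))  # -> [0, 2, 3]: 0.0, then 10.0, then 5.0
```

Because each pick maximises the minimum distance to the current sample, the selected vectors land near the "corners" of the cluster rather than in its dense centre.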

Perform oracle with 100.00% accuracy on 60 weight vectors
  The oracle will correctly classify 60 weight vectors and wrongly classify 0
  Classified 8 matches and 52 non-matches
    Purity of oracle classification:  0.867
    Entropy of oracle classification: 0.567
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 60 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(15)61_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 61), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)61_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1058
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1058 weight vectors
  Containing 209 true matches and 849 true non-matches
    (19.75% true matches)
  Identified 1011 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   976  (96.54%)
          2 :    32  (3.17%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1011 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1057
  Number of unique weight vectors: 1011
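
The load step's frequency and pureness statistics can be reproduced with a `Counter` over the (hashable) weight vectors: count how often each unique vector occurs and, per unique vector, the fraction of occurrences that are true matches. A vector whose pureness lies strictly between 0 and 1 is non-pure, and its minority-class copies are removed. A sketch over made-up data (the real program reads the pairs from the weight vector file):

```python
from collections import Counter

# Hypothetical (weight_vector, true_match) pairs.
pairs = [
    ((0.9, 0.8), True), ((0.9, 0.8), True), ((0.9, 0.8), False),  # non-pure
    ((0.1, 0.2), False), ((0.1, 0.2), False),                     # pure non-match
    ((1.0, 1.0), True),                                           # pure match
]

occurrences = Counter(v for v, _ in pairs)
matches = Counter(v for v, m in pairs if m)

# Pureness of a unique vector = proportion of its occurrences that are matches.
pureness = {v: matches[v] / n for v, n in occurrences.items()}
non_pure = [v for v, p in pureness.items() if 0.0 < p < 1.0]

# Keep only the majority-class copies of each non-pure unique vector.
cleaned = [(v, m) for v, m in pairs
           if v not in non_pure or m == (pureness[v] >= 0.5)]
print(non_pure, len(cleaned))  # -> [(0.9, 0.8)] 5
```

In the run above, one unique vector with pureness 0.917 is mostly matches, so its single non-match occurrence is dropped, leaving 1057 of the original 1058 vectors.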

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1011, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1011 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1011 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 924 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 104 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (104, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 104 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 104 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)652_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 652), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)652_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 879
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 879 weight vectors
  Containing 210 true matches and 669 true non-matches
    (23.89% true matches)
  Identified 827 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   791  (95.65%)
          2 :    33  (3.99%)
          3 :     2  (0.24%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 827 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 648

Removed 1 non-pure weight vector

Final number of weight vectors to use: 878
  Number of unique weight vectors: 827

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (827, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 827 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 827 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 32 matches and 54 non-matches
    Purity of oracle classification:  0.628
    Entropy of oracle classification: 0.952
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 741 weight vectors
  Based on 32 matches and 54 non-matches
  Classified 165 matches and 576 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (165, 0.627906976744186, 0.9522656254366642, 0.37209302325581395)
    (576, 0.627906976744186, 0.9522656254366642, 0.37209302325581395)

Current size of match and non-match training data sets: 32 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 576 weight vectors
- Estimated match proportion 0.372

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 576 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.857, 0.417, 0.750, 0.500, 0.455] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(15)601_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 601), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)601_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 532
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 532 weight vectors
  Containing 213 true matches and 319 true non-matches
    (40.04% true matches)
  Identified 496 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   479  (96.57%)
          2 :    14  (2.82%)
          3 :     2  (0.40%)
         19 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 496 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 316

Removed 1 non-pure weight vector

Final number of weight vectors to use: 531
  Number of unique weight vectors: 496

Time to load and analyse the weight vector file: 0.01 sec
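
The "pureness" analysis above groups identical weight vectors, checks whether their true-match labels agree, and drops the minority class from any impure group (here, 1 of the 532 vectors). A minimal sketch of that filter, assuming vectors and labels live in plain Python lists (the function name and data layout are illustrative, not the script's actual API):

```python
from collections import defaultdict

def remove_non_pure_minority(weight_vectors, labels):
    """Group identical weight vectors; for any group whose true-match
    labels disagree (0 < pureness < 1), drop the minority-class copies."""
    groups = defaultdict(list)  # vector tuple -> indices of its copies
    for i, wv in enumerate(weight_vectors):
        groups[tuple(wv)].append(i)

    keep = []
    for idxs in groups.values():
        n_match = sum(labels[i] for i in idxs)
        pureness = n_match / len(idxs)  # fraction of true matches in group
        if 0.0 < pureness < 1.0:
            # impure group: keep only the majority class
            majority = 1 if pureness >= 0.5 else 0
            keep.extend(i for i in idxs if labels[i] == majority)
        else:
            keep.extend(idxs)
    return sorted(keep)
```

For the run above this would remove the single minority (non-match) copy of the vector with pureness 0.947, leaving 531 of 532 weight vectors.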

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (496, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 496 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 496 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
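
The "far" initial selection above is a farthest-first traversal: start from one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A rough sketch using Euclidean distance (the actual script's distance measure and choice of starting vector are assumptions):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedily select k vectors: each pick maximizes the minimum
    Euclidean distance to the vectors already selected."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    # min_dist[i] = distance from vector i to its nearest selected vector
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[nxt]))
    return selected
```

This greedy heuristic tends to cover the corners of the weight-vector space first, which is why the selected vectors above mix extreme similarity profiles of both classes.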

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 34 matches and 46 non-matches
    Purity of oracle classification:  0.575
    Entropy of oracle classification: 0.984
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  46
    Number of false non-matches: 0
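
The purity and entropy figures reported for an oracle-classified sample follow the usual definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy (in bits) of the match proportion. As a sketch:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = majority-class fraction; entropy = binary Shannon
    entropy (in bits) of the match proportion."""
    total = num_match + num_non_match
    p = num_match / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the 34 matches and 46 non-matches above this gives purity 46/80 = 0.575 and entropy ≈ 0.984, matching the log.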

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 416 weight vectors
  Based on 34 matches and 46 non-matches
  Classified 140 matches and 276 non-matches
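
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by predicted match status, yielding the two sub-clusters queued in the next loop. A rough scikit-learn sketch (the kernel and parameters here are assumptions, not necessarily what the script uses):

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, cluster_vectors):
    """Train an SVM on the oracle-classified sample, then split the
    remaining cluster into predicted matches and non-matches."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(cluster_vectors)
    match_cluster = [v for v, p in zip(cluster_vectors, preds) if p == 1]
    non_match_cluster = [v for v, p in zip(cluster_vectors, preds) if p == 0]
    return match_cluster, non_match_cluster
```

Here the 416 unclassified vectors are split into 140 predicted matches and 276 predicted non-matches, each pushed onto the cluster queue.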

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.575, 0.9837082626231857, 0.425)
    (276, 0.575, 0.9837082626231857, 0.425)

Current size of match and non-match training data sets: 34 / 46

Selected cluster (queue ordering: random) with:
- Purity 0.57 and entropy 0.98
- Size 276 weight vectors
- Estimated match proportion 0.425

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 276 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 5 matches and 65 non-matches
    Purity of oracle classification:  0.929
    Entropy of oracle classification: 0.371
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
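
An imperfect oracle (the oracle_acc parameter in the usage notes) can be simulated by returning each true match status correctly with the given probability and flipping it otherwise. A minimal sketch:

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    """Simulate a human oracle: each true match status (0/1) is returned
    correctly with probability `accuracy`, otherwise flipped."""
    rng = rng or random.Random()
    return [lab if rng.random() < accuracy else 1 - lab
            for lab in true_labels]
```

With 100.00% accuracy, as in the runs above, no labels are flipped and the false match/non-match counts stay at zero.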

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analyzing file: diverg(20)291_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 291), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)291_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)453_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.99
recall                 0.331104
f-measure              0.496241
da                          100
dm                            0
ndm                           0
tp                           99
fp                            1
tn                  4.76529e+07
fn                          200
Name: (15, 1 - acm diverg, 453), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)453_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1032
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1032 weight vectors
  Containing 166 true matches and 866 true non-matches
    (16.09% true matches)
  Identified 993 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   964  (97.08%)
          2 :    26  (2.62%)
          3 :     2  (0.20%)
         10 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 993 unique weight vectors)
Pureness (proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 147
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 845

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1031
  Number of unique weight vectors: 993

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (993, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 993 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 993 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
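
The purity and entropy figures the oracle step reports can be reproduced from the match and non-match counts: purity is the majority-class proportion, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (the function name is illustrative, not from the script):

```python
import math

def purity_entropy(num_match, num_nonmatch):
    """Purity = majority-class proportion; entropy = binary Shannon
    entropy (in bits) of the match proportion."""
    total = num_match + num_nonmatch
    p = num_match / total                  # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = sum(-q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# The 24-match / 63-non-match oracle result above gives
# purity ~ 0.724 and entropy ~ 0.850, matching the log.
```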

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 906 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 73 matches and 833 non-matches
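
The split step fits a classifier to the oracle-labelled vectors and partitions the remaining, unclassified vectors by its predictions. A hedged sketch using scikit-learn's `SVC` as a stand-in (the script's actual SVM implementation and function names are not shown in this log):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, rest):
    """Fit a linear SVM on the oracle-labelled weight vectors, then split
    the unclassified remainder of the cluster into a predicted-match part
    and a predicted-non-match part."""
    clf = SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)      # e.g. 24 matches + 63 non-matches
    pred = clf.predict(rest)
    return rest[pred == 1], rest[pred == 0]
```

Both predicted sub-clusters then go back onto the queue, which is why the next loop header shows a queue length of 2.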

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (73, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (833, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.85
- Size 73 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 38

Farthest first selection of 38 weight vectors from 73 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
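
Farthest-first selection greedily adds, at each step, the vector whose minimum distance to the vectors selected so far is largest (the classic k-centre heuristic). A minimal sketch assuming Euclidean distance and a fixed first seed, both of which are assumptions about the script:

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum Euclidean distance to the selected set is largest."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [seed]
    min_dist = np.linalg.norm(vectors - vectors[seed], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))     # farthest from all selected
        selected.append(nxt)
        min_dist = np.minimum(
            min_dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```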

Perform oracle with 100.00% accuracy on 38 weight vectors
  The oracle will correctly classify 38 weight vectors and wrongly classify 0
  Classified 38 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 38 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(15)133_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (15, 1 - acm diverg, 133), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)133_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 569
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 569 weight vectors
  Containing 182 true matches and 387 true non-matches
    (31.99% true matches)
  Identified 547 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   531  (97.07%)
          2 :    13  (2.38%)
          3 :     2  (0.37%)
          6 :     1  (0.18%)
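
The occurrence table above counts in two stages: first how often each distinct weight vector occurs, then how many distinct vectors share each occurrence count. A small sketch (helper name hypothetical):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map 'occurrence count' -> 'number of distinct weight vectors
    occurring that often', as in the table above."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())

# Four vectors where one value repeats:
# occurrence_distribution([(1.0,), (1.0,), (0.5,), (0.2,)])
# -> Counter({1: 2, 2: 1})
```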

Identified 0 non-pure unique weight vectors (from 547 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 162
     0.000 : 385

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 569
  Number of unique weight vectors: 547

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (547, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 547 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 547 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 30 matches and 51 non-matches
    Purity of oracle classification:  0.630
    Entropy of oracle classification: 0.951
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 466 weight vectors
  Based on 30 matches and 51 non-matches
  Classified 141 matches and 325 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6296296296296297, 0.9509560484549725, 0.37037037037037035)
    (325, 0.6296296296296297, 0.9509560484549725, 0.37037037037037035)

Current size of match and non-match training data sets: 30 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 325 weight vectors
- Estimated match proportion 0.370

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 325 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.579, 0.583, 0.522, 0.417, 0.563] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.333, 0.667, 0.400, 0.583, 0.563] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.545, 0.667, 0.571, 0.700, 0.667] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.783, 0.583, 0.435, 0.765, 0.429] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 0 matches and 70 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(15)698_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 698), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)698_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 812
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 812 weight vectors
  Containing 227 true matches and 585 true non-matches
    (27.96% true matches)
  Identified 755 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   718  (95.10%)
          2 :    34  (4.50%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 755 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 564

Removed 1 non-pure weight vector
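
A non-pure unique weight vector is one whose identical copies carry both match and non-match labels; only its minority-class copies are removed, which is why the 0.950-pure vector above loses a single copy. A hedged sketch of that filter (names illustrative; breaking ties toward the match class is an assumption):

```python
from collections import defaultdict

def drop_minority_copies(weight_vectors, is_match):
    """Group identical weight vectors, determine each group's majority
    class, and keep only the majority-class copies (indices returned)."""
    groups = defaultdict(list)
    for i, vec in enumerate(weight_vectors):
        groups[tuple(vec)].append(i)
    keep = []
    for idxs in groups.values():
        matches = sum(1 for i in idxs if is_match[i])
        majority_is_match = 2 * matches >= len(idxs)   # tie -> match
        keep.extend(i for i in idxs if is_match[i] == majority_is_match)
    return sorted(keep)
```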

Final number of weight vectors to use: 811
  Number of unique weight vectors: 755

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (755, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 755 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 755 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 670 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 159 matches and 511 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (511, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 511 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 511 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.385, 0.478, 0.643, 0.692, 0.611] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.462, 0.609, 0.684, 0.308, 0.545] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.791, 1.000, 0.275, 0.269, 0.192, 0.084, 0.200] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 4 matches and 70 non-matches
    Purity of oracle classification:  0.946
    Entropy of oracle classification: 0.303
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0
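
The purity and entropy values reported after each oracle step can be reproduced with a short sketch. This is an illustrative reconstruction, not code from the original script: purity is assumed to be the majority-class fraction of the classified sample, and entropy the binary Shannon entropy of the match/non-match split, which reproduces the 0.946 and 0.303 above.

```python
import math

def purity(num_match, num_non_match):
    # Majority-class fraction of the oracle-classified sample.
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    # Binary Shannon entropy (in bits) of the match / non-match split.
    total = num_match + num_non_match
    h = 0.0
    for count in (num_match, num_non_match):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# Counts from the oracle step above: 4 matches, 70 non-matches
print(round(purity(4, 70), 3))   # 0.946
print(round(entropy(4, 70), 3))  # 0.303
```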

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)534_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 534), dtype: object
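
The precision, recall and f-measure fields in each per-file summary follow directly from the tp/fp/fn counts. A small sketch (illustrative only; the wrapper script that prints these summaries is not shown in this log) reproduces the values above:

```python
def f_measure(tp, fp, fn):
    # Precision, recall and F1 from the confusion counts in the summary.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Counts from the summary above: tp=42, fp=0, fn=257
p, r, f = f_measure(42, 0, 257)
print(round(p, 6), round(r, 6), round(f, 6))  # 1.0 0.140468 0.246334
```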

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)534_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1010
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1010 weight vectors
  Containing 223 true matches and 787 true non-matches
    (22.08% true matches)
  Identified 956 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   919  (96.13%)
          2 :    34  (3.56%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 956 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 766

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1009
  Number of unique weight vectors: 956

Time to load and analyse the weight vector file: 0.01 sec
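
The unique-vector count and the occurrence frequency distribution reported during loading can be sketched with a `Counter`. This is a simplified reconstruction, not the original code; weight vectors are assumed to be hashable tuples:

```python
from collections import Counter

def occurrence_distribution(vectors):
    # Map each unique weight vector to its number of occurrences,
    # then count how many unique vectors share each occurrence count.
    freq = Counter(tuple(v) for v in vectors)
    occ_distr = Counter(freq.values())
    return len(freq), dict(occ_distr)

# Tiny illustrative example (not data from the run above)
vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.9)]
num_unique, distr = occurrence_distribution(vecs)
print(num_unique, distr)  # 3 {2: 1, 1: 2}
```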

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (956, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 956 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 956 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
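
The "far" initial selection is a farthest-first traversal: starting from a seed vector, it repeatedly picks the vector whose minimum distance to the already selected set is largest. A minimal sketch, assuming Euclidean distance and a fixed seed index (the original script's distance measure and seeding are not shown in this log):

```python
def farthest_first(vectors, k, start=0):
    # Greedy farthest-first traversal: pick the vector maximising the
    # minimum Euclidean distance to the vectors selected so far.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [start]
    # min_dist[i]: distance from vector i to its nearest selected vector
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[nxt]))
    return selected

# Tiny illustrative example (indices of the selected vectors)
vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(vecs, 3))  # [0, 1, 4]
```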

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 869 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 165 matches and 704 non-matches
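
The SVM split step trains on the oracle-classified sample (here 30 matches and 57 non-matches) and partitions the remaining weight vectors into two child clusters by predicted class. A hedged sketch using scikit-learn's `svm.SVC`; the kernel and parameters are assumptions, as the log does not show them:

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, rest_vecs):
    # Train an SVM on the oracle-classified sample, then split the
    # remaining weight vectors into predicted-match / predicted-non-match
    # child clusters (kernel choice here is an assumption).
    clf = svm.SVC(kernel='linear')
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, pred) if p]
    non_matches = [v for v, p in zip(rest_vecs, pred) if not p]
    return matches, non_matches

# Tiny illustrative example: labels 1 = match, 0 = non-match
m, n = svm_split([[0, 0], [0, 1], [1, 0], [1, 1]], [0, 0, 1, 1],
                 [[0.1, 0.5], [0.9, 0.5]])
print(m, n)  # [[0.9, 0.5]] [[0.1, 0.5]]
```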

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (165, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (704, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 704 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 704 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.700, 0.545, 0.526, 0.818, 0.722] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 2 matches and 75 non-matches
    Purity of oracle classification:  0.974
    Entropy of oracle classification: 0.174
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)59_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (10, 1 - acm diverg, 59), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)59_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 732
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 732 weight vectors
  Containing 184 true matches and 548 true non-matches
    (25.14% true matches)
  Identified 708 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   695  (98.16%)
          2 :    10  (1.41%)
          3 :     2  (0.28%)
         11 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 708 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 162
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 545

Removed 1 non-pure weight vector

Final number of weight vectors to use: 731
  Number of unique weight vectors: 708

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (708, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 708 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 708 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 36 matches and 48 non-matches
    Purity of oracle classification:  0.571
    Entropy of oracle classification: 0.985
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 624 weight vectors
  Based on 36 matches and 48 non-matches
  Classified 261 matches and 363 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (261, 0.5714285714285714, 0.9852281360342516, 0.42857142857142855)
    (363, 0.5714285714285714, 0.9852281360342516, 0.42857142857142855)

Current size of match and non-match training data sets: 36 / 48

Selected cluster with (queue ordering: random):
- Purity 0.57 and entropy 0.99
- Size 363 weight vectors
- Estimated match proportion 0.429

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 363 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.462, 0.667, 0.600, 0.389, 0.615] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.500, 0.739, 0.824, 0.591, 0.550] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 1 match and 74 non-matches
    Purity of oracle classification:  0.987
    Entropy of oracle classification: 0.102
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing file: diverg(10)102_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987952
recall                 0.274247
f-measure              0.429319
da                           83
dm                            0
ndm                           0
tp                           82
fp                            1
tn                  4.76529e+07
fn                          217
Name: (10, 1 - acm diverg, 102), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)102_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 907
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 907 weight vectors
  Containing 175 true matches and 732 true non-matches
    (19.29% true matches)
  Identified 868 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   838  (96.54%)
          2 :    27  (3.11%)
          3 :     2  (0.23%)
          9 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 868 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 156
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 711

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 898
  Number of unique weight vectors: 867

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (867, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 867 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 867 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
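
Farthest-first selection, as used above, greedily picks weight vectors that are maximally spread out in the similarity space. A minimal sketch assuming Euclidean distance and seeding from the first vector (the original script's distance measure and seeding rule may differ):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: seed with the first vector, then
    # repeatedly add the candidate whose minimum Euclidean distance to
    # the already-selected set is largest.
    selected = [vectors[0]]
    candidates = list(vectors[1:])
    while len(selected) < k and candidates:
        i = max(range(len(candidates)),
                key=lambda j: min(math.dist(candidates[j], s) for s in selected))
        selected.append(candidates.pop(i))
    return selected
```

This tends to pick extreme vectors first (clear matches and clear non-matches), which is why the sample above mixes both classes.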

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
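
The oracle step can be simulated by returning each true label with the configured accuracy and the flipped label otherwise; at 100% accuracy, as in this run, every classification is correct. The original appears to fix the exact number of wrong classifications up front; a simpler per-label Bernoulli sketch (function name and seed are illustrative):

```python
import random

def oracle_classify(true_labels, accuracy, rng=None):
    # Simulated imperfect oracle: each label is returned correctly with
    # probability `accuracy`, and flipped otherwise. accuracy=1.0
    # reproduces a perfect manual classifier.
    rng = rng or random.Random(42)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```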

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 781 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 130 matches and 651 non-matches
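
The split step trains a classifier on the oracle-labelled sample and partitions the remaining vectors of the cluster by its predictions. A sketch using scikit-learn's `SVC`; the kernel and parameters used by the original script are not shown in the log, so a linear kernel is assumed:

```python
from sklearn import svm

def split_cluster_with_svm(labelled_vecs, labels, unlabelled_vecs):
    # Train an SVM on the oracle-labelled sample, then split the
    # remaining weight vectors of the cluster into a predicted-match
    # and a predicted-non-match sub-cluster.
    clf = svm.SVC(kernel="linear")
    clf.fit(labelled_vecs, labels)
    preds = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, preds) if p]
    non_matches = [v for v, p in zip(unlabelled_vecs, preds) if not p]
    return matches, non_matches
```

Both sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.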

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (130, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (651, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
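
The purity, entropy, and estimated match proportion printed for each queued cluster follow the standard binary definitions, computed from the oracle-labelled sample (28 matches and 58 non-matches here). A sketch (the function name is hypothetical):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    # Purity is the majority-class fraction of the labelled sample;
    # entropy is the binary entropy of the estimated match proportion.
    total = num_matches + num_non_matches
    p = num_matches / total  # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0 if p in (0.0, 1.0) else \
        -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
    return purity, entropy, p
```

A pure cluster (p near 0 or 1) has entropy near 0 and needs no further splitting; a maximally mixed one (p = 0.5) has entropy 1.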

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 651 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 651 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 2 matches and 73 non-matches
    Purity of oracle classification:  0.973
    Entropy of oracle classification: 0.177
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

83.0
Analysing file: diverg(20)150_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 150), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)150_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)127_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985915
recall                 0.234114
f-measure              0.378378
da                           71
dm                            0
ndm                           0
tp                           70
fp                            1
tn                  4.76529e+07
fn                          229
Name: (10, 1 - acm diverg, 127), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)127_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 630
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 630 weight vectors
  Containing 186 true matches and 444 true non-matches
    (29.52% true matches)
  Identified 590 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   556  (94.24%)
          2 :    31  (5.25%)
          3 :     2  (0.34%)
          6 :     1  (0.17%)

Identified 0 non-pure unique weight vectors (from 590 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 166
     0.000 : 424

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 630
  Number of unique weight vectors: 590

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (590, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 590 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 590 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
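The purity, entropy, and match-proportion figures reported for each oracle classification can be reproduced from the match/non-match counts alone: purity is the majority-class fraction, entropy the binary Shannon entropy in bits, and the estimated match proportion is simply the fraction of matches. A minimal sketch (the function name is illustrative, not from the original script):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Return (purity, entropy, match_proportion) for a labelled cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total          # proportion of matches
    purity = max(p, 1.0 - p)         # fraction in the majority class
    # Binary Shannon entropy in bits; 0.0 for a pure cluster
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy, p

# The oracle above classified 27 matches and 55 non-matches:
purity, entropy, match_prop = cluster_stats(27, 55)
print(round(purity, 3), round(entropy, 3), round(match_prop, 3))
# → 0.671 0.914 0.329
```

These values match the log output above (purity 0.671, entropy 0.914, estimated match proportion 0.329).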

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 508 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 149 matches and 359 non-matches
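The split step above trains a binary classifier on the oracle-labelled weight vectors and partitions the remaining cluster by predicted class. The script uses an SVM; the sketch below substitutes a simple nearest-centroid classifier to stay dependency-free, but the split logic is the same (train on labelled vectors, predict on the rest, form two sub-clusters):

```python
def split_cluster(train_vecs, train_labels, remaining_vecs):
    """Split remaining_vecs into predicted matches and non-matches.
    Nearest-centroid stand-in for the SVM used by the script."""
    def centroid(vecs):
        return [sum(v[i] for v in vecs) / len(vecs) for i in range(len(vecs[0]))]
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    # Class centroids from the oracle-labelled training vectors
    c_match = centroid([v for v, m in zip(train_vecs, train_labels) if m])
    c_non = centroid([v for v, m in zip(train_vecs, train_labels) if not m])
    matches, non_matches = [], []
    for v in remaining_vecs:
        (matches if sqdist(v, c_match) < sqdist(v, c_non)
         else non_matches).append(v)
    return matches, non_matches

# Toy example: match-like vectors near 1.0, non-match-like near 0.0
train = [[0.9, 0.9, 0.9], [1.0, 0.8, 0.9], [0.1, 0.2, 0.1], [0.0, 0.1, 0.2]]
labels = [True, True, False, False]
m, nm = split_cluster(train, labels, [[0.95, 0.9, 1.0], [0.05, 0.1, 0.0]])
print(len(m), len(nm))  # → 1 1
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.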

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (359, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 149 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
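Farthest-first selection, used above to pick the sample sent to the oracle, greedily builds a sample that covers the weight-vector space: after a seed vector, each step adds the vector whose distance to its closest already-selected vector is largest. A minimal sketch (the seed choice and squared Euclidean distance are assumptions, not taken from the original script):

```python
def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    selected = [vectors[0]]          # seed with the first vector (assumption)
    candidates = list(vectors[1:])
    while len(selected) < k and candidates:
        # Pick the candidate farthest from its nearest selected vector
        best = max(candidates,
                   key=lambda v: min(sqdist(v, s) for s in selected))
        selected.append(best)
        candidates.remove(best)
    return selected

sample = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [0.9, 1.0]], 2)
print(sample)  # → [[0.0, 0.0], [1.0, 1.0]]
```

Because each new pick maximises the minimum distance to the current sample, near-duplicate vectors are skipped and the oracle's budget is spent on diverse weight vectors.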

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 45 matches and 9 non-matches
    Purity of oracle classification:  0.833
    Entropy of oracle classification: 0.650
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

71.0
Analysing the file: diverg(15)528_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                 0.976
recall                 0.408027
f-measure              0.575472
da                          125
dm                            0
ndm                           0
tp                          122
fp                            3
tn                  4.76529e+07
fn                          177
Name: (15, 1 - acm diverg, 528), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)528_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 968
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 968 weight vectors
  Containing 143 true matches and 825 true non-matches
    (14.77% true matches)
  Identified 934 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   905  (96.90%)
          2 :    26  (2.78%)
          3 :     2  (0.21%)
          5 :     1  (0.11%)
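The occurrence distribution above can be produced with two nested counts: first count how often each distinct weight vector occurs, then count how many distinct vectors share each occurrence count. A sketch using `collections.Counter`:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map: occurrence count -> number of distinct vectors occurring that often."""
    # Vectors are lists, so convert to tuples to make them hashable
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

vectors = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3], [0.9, 0.9], [1.0, 0.5]]
dist = occurrence_distribution(vectors)
print(sorted(dist.items()))  # → [(1, 2), (3, 1)]
```

In the run above this yields 905 vectors occurring once, 26 twice, 2 three times, and 1 five times, for 934 unique vectors in total.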

Identified 0 non-pure unique weight vectors (from 934 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 129
     0.000 : 805

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 968
  Number of unique weight vectors: 934

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (934, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 934 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 934 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 847 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 91 matches and 756 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (91, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (756, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 91 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 91 vectors
  The selected farthest weight vectors are:
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 40 matches and 3 non-matches
    Purity of oracle classification:  0.930
    Entropy of oracle classification: 0.365
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

125.0
Analysing the file: diverg(15)277_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 277), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)277_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 689
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 689 weight vectors
  Containing 219 true matches and 470 true non-matches
    (31.79% true matches)
  Identified 656 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   640  (97.56%)
          2 :    13  (1.98%)
          3 :     2  (0.30%)
         17 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 656 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 186
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 469

Removed 1 non-pure weight vector
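A weight vector is non-pure when identical vectors were generated by both matching and non-matching record pairs; here one vector occurred 17 times as 16 matches and 1 non-match, giving pureness 16/17 ≈ 0.941, and only its minority-class copy is removed (689 → 688 vectors). A sketch of that clean-up (the tie-breaking at pureness 0.5 is an assumption):

```python
from collections import defaultdict

def remove_non_pure(pairs):
    """pairs: list of (weight_vector, is_match) tuples.
    Compute pureness (match fraction) per distinct vector and drop
    the minority-class copies of every non-pure vector."""
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, flags in groups.items():
        pureness = sum(flags) / len(flags)
        majority = pureness >= 0.5  # ties go to the match class (assumption)
        kept.extend((list(vec), f) for f in flags if f == majority)
    return kept

# One vector occurs 17 times (16 matches, 1 non-match -> pureness 0.941),
# another twice as a pure non-match: only the single minority copy is removed
pairs = [([1.0], True)] * 16 + [([1.0], False)] + [([0.2], False)] * 2
print(len(remove_non_pure(pairs)))  # → 18
```

After this step every surviving unique weight vector is pure, so a cluster's true purity is well defined during the recursive splitting.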

Final number of weight vectors to use: 688
  Number of unique weight vectors: 656

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (656, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 656 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 656 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 572 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 128 matches and 444 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (128, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (444, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 444 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 444 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.409, 0.654, 0.500, 0.516, 0.333] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.500, 0.452, 0.632, 0.714, 0.667] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.786, 0.833, 0.545, 0.478, 0.346] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.704, 0.600, 0.333, 0.370, 0.188] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 16 matches and 57 non-matches
    Purity of oracle classification:  0.781
    Entropy of oracle classification: 0.759
    Number of true matches:      16
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

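The purity and entropy figures reported for an oracle classification are the majority-class fraction and the binary (Shannon) entropy of the match/non-match split. A minimal sketch reproducing the numbers above (16 matches, 57 non-matches); the function name is illustrative, not from the original script:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity (majority-class fraction) and binary entropy of a
    match/non-match split."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# Reproduce the oracle statistics above: 16 matches, 57 non-matches
purity, entropy = purity_entropy(16, 57)
print(round(purity, 3), round(entropy, 3))  # 0.781 0.759
```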
Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)606_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979167
recall                 0.157191
f-measure              0.270893
da                           48
dm                            0
ndm                           0
tp                           47
fp                            1
tn                  4.76529e+07
fn                          252
Name: (10, 1 - acm diverg, 606), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)606_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 247
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 247 weight vectors
  Containing 167 true matches and 80 true non-matches
    (67.61% true matches)
  Identified 217 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   205  (94.47%)
          2 :     9  (4.15%)
          3 :     2  (0.92%)
         18 :     1  (0.46%)

Identified 1 non-pure unique weight vector (from 217 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 137
     0.944 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 79

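The pureness table above gives, for each unique weight vector, the fraction of its duplicate occurrences that are true matches; copies carrying the minority label of a non-pure vector are then removed. A sketch under the assumption that weight vectors are tuples paired with boolean match labels (the helper name is hypothetical):

```python
from collections import Counter, defaultdict

def remove_minority_copies(weight_vectors):
    """For each unique weight vector, drop the copies whose match label
    is in the minority for that vector."""
    labels = defaultdict(Counter)          # vector -> Counter({True: n, False: m})
    for vec, is_match in weight_vectors:
        labels[tuple(vec)][is_match] += 1
    kept = []
    for vec, is_match in weight_vectors:
        counts = labels[tuple(vec)]
        # keep the copy unless its label is strictly the minority one
        if counts[is_match] >= counts[not is_match]:
            kept.append((vec, is_match))
    return kept

# A vector occurring 18 times, 17 matches and 1 non-match (pureness 17/18 = 0.944):
data = [((0.9, 0.8), True)] * 17 + [((0.9, 0.8), False)] + [((0.1, 0.2), False)]
print(len(remove_minority_copies(data)))  # 18 (only the minority copy removed)
```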
Removed 1 non-pure weight vector

Final number of weight vectors to use: 246
  Number of unique weight vectors: 217

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (217, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 217 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 67

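The log does not show how the sample size is computed. Cochran's sample-size formula at 95% confidence with a 0.1 sampling error (presumably the `sample_error` parameter) and a finite-population correction reproduces the first-loop sizes exactly, so a plausible (but unconfirmed) reconstruction is:

```python
def cochran_sample_size(population, p=0.5, error=0.1, z=1.96):
    """Cochran's sample size with finite-population correction.

    Hypothetical reconstruction: p is the estimated match proportion,
    error the sampling error, z the 95%-confidence score.
    """
    n0 = (z ** 2) * p * (1.0 - p) / (error ** 2)
    return int(round(n0 / (1.0 + (n0 - 1.0) / population)))

for n in (217, 297, 1038):
    print(n, cochran_sample_size(n))  # 217 -> 67, 297 -> 73, 1038 -> 88
```

The later-loop sizes (e.g. 29 of 40, 51 of 110) come close when the updated match-proportion estimate is used for `p`, though the exact rounding the script applies is unclear.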
Perform initial selection using "far" method

Farthest first selection of 67 weight vectors from 217 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

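Farthest-first selection greedily grows the sample by always adding the vector whose minimum distance to the already-selected vectors is largest. A minimal sketch, assuming Euclidean distance and seeding from the first vector (the actual script may seed differently):

```python
def farthest_first(vectors, k):
    """Farthest-first traversal: repeatedly pick the vector whose minimum
    squared Euclidean distance to the selected set is largest."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    remaining = list(vectors)
    selected = [remaining.pop(0)]          # seed with the first vector
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda v: min(dist2(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

Note the near-duplicate (0.1, 0.0) is skipped: farthest-first favours spread-out, diverse weight vectors for the oracle to label.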
Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and wrongly classify 0
  Classified 30 matches and 37 non-matches
    Purity of oracle classification:  0.552
    Entropy of oracle classification: 0.992
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0

Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 150 weight vectors
  Based on 30 matches and 37 non-matches
  Classified 110 matches and 40 non-matches

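After the oracle labels the sample, the remaining vectors of the cluster are split into two child clusters by an SVM trained on those labels. A toy sketch assuming scikit-learn's `SVC` (the original script may use a different SVM implementation); all data here are hypothetical:

```python
from sklearn.svm import SVC

# Oracle-labelled training data (hypothetical 2-D toy vectors):
train_X = [[0.1, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]
train_y = [0, 0, 1, 1]                  # 0 = non-match, 1 = match

clf = SVC()                             # default RBF kernel
clf.fit(train_X, train_y)

# Split the unlabelled remainder of the cluster into two child clusters
remainder = [[0.15, 0.05], [0.95, 0.9]]
pred = clf.predict(remainder)
match_child = [v for v, p in zip(remainder, pred) if p == 1]
non_match_child = [v for v, p in zip(remainder, pred) if p == 0]
print(len(match_child), len(non_match_child))  # 1 1
```

In the run above this step splits the 150 remaining vectors into child clusters of 110 predicted matches and 40 predicted non-matches, which are then pushed back onto the queue.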
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 67
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (110, 0.5522388059701493, 0.99211169200215, 0.44776119402985076)
    (40, 0.5522388059701493, 0.99211169200215, 0.44776119402985076)

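The queue tuples above have the form (size, purity, entropy, estimated match proportion); both child clusters inherit the statistics of the parent's 67 oracle-labelled sample (30 matches, 37 non-matches). A sketch reproducing those numbers:

```python
import math

# Statistics of the 67 oracle labels: 30 matches, 37 non-matches
matches, non_matches = 30, 37
total = matches + non_matches

est_match_prop = matches / total                # 30/67
purity = max(matches, non_matches) / total      # majority-class fraction
entropy = -sum(q * math.log(q, 2)
               for q in (est_match_prop, 1 - est_match_prop))

print(round(purity, 4), round(entropy, 4), round(est_match_prop, 4))
# 0.5522 0.9921 0.4478
```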
Current size of match and non-match training data sets: 30 / 37

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 40 weight vectors
- Estimated match proportion 0.448

Sample size for this cluster: 29

Farthest first selection of 29 weight vectors from 40 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.442, 1.000, 0.235, 0.184, 0.120, 0.167, 0.185] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.456, 1.000, 0.087, 0.208, 0.125, 0.152, 0.061] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.640, 1.000, 0.176, 0.156, 0.200, 0.158, 0.000] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.650, 1.000, 0.194, 0.167, 0.167, 0.233, 0.204] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.800, 1.000, 0.242, 0.121, 0.200, 0.171, 0.000] (False)
    [0.800, 1.000, 0.167, 0.180, 0.151, 0.147, 0.203] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.619, 1.000, 0.103, 0.163, 0.129, 0.146, 0.213] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)

Perform oracle with 100.00% accuracy on 29 weight vectors
  The oracle will correctly classify 29 weight vectors and wrongly classify 0
  Classified 1 match and 28 non-matches
    Purity of oracle classification:  0.966
    Entropy of oracle classification: 0.216
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 29 weight vectors (classified by oracle) from cluster

Cluster is pure enough and not too large; add its 40 weight vectors to:
  Non-match training set

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 3: Queue length: 1
  Number of manual oracle classifications performed: 96
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (110, 0.5522388059701493, 0.99211169200215, 0.44776119402985076)

Current size of match and non-match training data sets: 31 / 76

Selected cluster (queue ordering: random) with:
- Purity 0.55 and entropy 0.99
- Size 110 weight vectors
- Estimated match proportion 0.448

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 110 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 48 matches and 3 non-matches
    Purity of oracle classification:  0.941
    Entropy of oracle classification: 0.323
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

48.0
Analysing file: diverg(15)538_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 538), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)538_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 331
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 331 weight vectors
  Containing 214 true matches and 117 true non-matches
    (64.65% true matches)
  Identified 297 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   282  (94.95%)
          2 :    12  (4.04%)
          3 :     2  (0.67%)
         19 :     1  (0.34%)

Identified 1 non-pure unique weight vector (from 297 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 116

Removed 1 non-pure weight vector

Final number of weight vectors to use: 330
  Number of unique weight vectors: 297

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (297, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 297 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 73

Perform initial selection using "far" method

Farthest first selection of 73 weight vectors from 297 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 34 matches and 39 non-matches
    Purity of oracle classification:  0.534
    Entropy of oracle classification: 0.997
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  39
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 224 weight vectors
  Based on 34 matches and 39 non-matches
  Classified 151 matches and 73 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 73
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.5342465753424658, 0.9966132830150964, 0.4657534246575342)
    (73, 0.5342465753424658, 0.9966132830150964, 0.4657534246575342)

Current size of match and non-match training data sets: 34 / 39

Selected cluster (queue ordering: random) with:
- Purity 0.53 and entropy 1.00
- Size 73 weight vectors
- Estimated match proportion 0.466

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 73 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.800, 1.000, 0.167, 0.180, 0.151, 0.147, 0.203] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 4 matches and 38 non-matches
    Purity of oracle classification:  0.905
    Entropy of oracle classification: 0.454
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)637_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 637), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)637_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

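The purity and entropy figures reported in the oracle block above follow the usual binary-cluster definitions (majority-class fraction and base-2 Shannon entropy of the match/non-match split). A minimal sketch — the function names here are mine, not the program's:

```python
import math

def purity(num_matches, num_non_matches):
    """Fraction of the cluster belonging to its majority class."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    """Shannon entropy (base 2) of the match/non-match split."""
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        p = count / total
        if p > 0.0:
            h -= p * math.log2(p)
    return h

# The oracle block above: 23 matches, 65 non-matches
print(round(purity(23, 65), 3), round(entropy(23, 65), 3))  # → 0.739 0.829
```

These reproduce the values the log carries forward into the Loop 2 queue (0.7386…, 0.8287…).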
Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

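The per-cluster sample sizes in this log (79 of 436, 68 of 847, 87 of 1018, 58 of 148) are consistent with a Cochran-style finite-population sample-size formula. The sketch below is an assumption on my part: z = 1.96 and sampling error e = 0.1 reproduce the logged sizes to within ±1, but the program's actual sample_error parameter is not shown in this section.

```python
def sample_size(cluster_size, est_match_prop, z=1.96, e=0.1):
    """Cochran-style sample size with finite-population correction.
    z and e are guesses; the real program takes sample_error as an argument."""
    p = est_match_prop
    n0 = z * z * p * (1.0 - p) / (e * e)        # infinite-population size
    return round(n0 / (1.0 + (n0 - 1.0) / cluster_size))
```

For this cluster, sample_size(847, 0.261) gives 68, matching the line above.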
Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

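Farthest-first traversal, as used for the selections above, greedily grows the sample by always adding the vector whose minimum distance to everything already selected is largest. A dependency-free sketch, assuming Euclidean distance and an arbitrary first seed (the original program may seed differently):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum Euclidean distance to the selected set is largest."""
    selected = [start]
    # Minimum distance from each vector to the selected set so far
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            d = math.dist(v, vectors[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected
```

The effect, visible in the listings above, is a sample that covers the extremes of the weight-vector space rather than its dense centre.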
Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

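The oracle blocks in this log run at 100% accuracy, but the program's oracle_acc parameter (see the usage in the header) allows a noisy oracle. One plausible model — flipping each true label with probability 1 − accuracy is my assumption, not confirmed by the log — can be sketched as:

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    """Return labels where each true label is kept with probability
    `accuracy` and flipped otherwise (flip model is an assumption)."""
    rng = rng or random.Random()
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

At accuracy 1.0 this degenerates to the perfect oracle seen above (zero false matches and zero false non-matches).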
Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(15)971_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 971), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)971_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 472
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 472 weight vectors
  Containing 223 true matches and 249 true non-matches
    (47.25% true matches)
  Identified 436 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   420  (96.33%)
          2 :    13  (2.98%)
          3 :     2  (0.46%)
         20 :     1  (0.23%)

Identified 1 non-pure unique weight vector (from 436 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 248

Removed 1 non-pure weight vector

Final number of weight vectors to use: 471
  Number of unique weight vectors: 436

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (436, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 436 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 436 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 34 matches and 45 non-matches
    Purity of oracle classification:  0.570
    Entropy of oracle classification: 0.986
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 357 weight vectors
  Based on 34 matches and 45 non-matches
  Classified 148 matches and 209 non-matches

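The SVM step above splits the remaining cluster in two by training a classifier on the oracle-labelled vectors and predicting the rest (the log names an SVM; libsvm or scikit-learn would be typical). As a dependency-free stand-in that shows the split mechanics — nearest-centroid in place of the SVM, plainly a substitute:

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(unlabelled, match_train, nonmatch_train):
    """Split unlabelled vectors into predicted matches / non-matches
    by distance to the two training centroids (SVM stand-in)."""
    cm, cn = centroid(match_train), centroid(nonmatch_train)
    matches, non_matches = [], []
    for v in unlabelled:
        (matches if math.dist(v, cm) < math.dist(v, cn)
         else non_matches).append(v)
    return matches, non_matches
```

Either way, the two predicted sub-clusters are pushed back onto the queue, which is why the queue length grows to 2 in the next loop.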
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.569620253164557, 0.9859690274511927, 0.43037974683544306)
    (209, 0.569620253164557, 0.9859690274511927, 0.43037974683544306)

Current size of match and non-match training data sets: 34 / 45

Selected cluster (queue ordering: random) with:
- Purity 0.57 and entropy 0.99
- Size 148 weight vectors
- Estimated match proportion 0.430

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 51 matches and 7 non-matches
    Purity of oracle classification:  0.879
    Entropy of oracle classification: 0.531
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)667_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 667), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)667_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1075
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1075 weight vectors
  Containing 227 true matches and 848 true non-matches
    (21.12% true matches)
  Identified 1018 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   981  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1018 unique weight vectors)
Pureness (as percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1074
  Number of unique weight vectors: 1018

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1018, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1018 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1018 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 931 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 819 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (819, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 819 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 819 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
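
The farthest-first traversal used above can be sketched as follows: starting from one seed vector, each step adds the remaining vector whose minimum distance to the already-selected set is largest. This is a minimal sketch only; Euclidean distance and taking the first vector as seed are assumptions, since the script's exact variant is not visible in this log.

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal: seed with the
    first vector, then repeatedly add the vector whose minimum
    Euclidean distance to the already-selected set is largest."""
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # A candidate's distance to the selected set is its minimum
        # distance to any already-selected vector.
        def min_dist(v):
            return min(math.dist(v, s) for s in selected)
        farthest = max(remaining, key=min_dist)
        selected.append(farthest)
        remaining.remove(farthest)
    return selected

# Toy example: the outlier (10, 0) is picked immediately.
print(farthest_first([(0, 0), (1, 0), (0.5, 0), (10, 0)], 2))
# → [(0, 0), (10, 0)]
```

This spreads the sample across the whole cluster, which is why the selected vectors above mix high- and low-similarity record pairs.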

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
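
The purity and entropy figures reported for each cluster and oracle result follow the usual binary definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. A sketch (helper names are assumptions; the script's own functions are not visible in this log), checked against the 14-match / 54-non-match result above:

```python
import math

def purity(num_match, num_non_match):
    # Fraction of the majority class within the cluster.
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    # Binary Shannon entropy (in bits) of the match/non-match split.
    total = num_match + num_non_match
    p = num_match / total
    if p in (0.0, 1.0):
        return 0.0  # a pure cluster has zero entropy
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(round(purity(14, 54), 3))   # → 0.794
print(round(entropy(14, 54), 3))  # → 0.734
```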

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)475_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (10, 1 - acm diverg, 475), dtype: object
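
The precision, recall and f-measure in this Series are consistent with the standard definitions applied to the tp/fp/fn counts it reports; a quick recomputation:

```python
# Recompute the quality metrics from the counts in the Series above
# (tp=98, fp=2, fn=201), using the standard definitions.
tp, fp, fn = 98, 2, 201

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_measure = 2 * precision * recall / (precision + recall)

print(round(precision, 2), round(recall, 6), round(f_measure, 6))
# → 0.98 0.327759 0.491228
```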

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)475_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 333
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 333 weight vectors
  Containing 161 true matches and 172 true non-matches
    (48.35% true matches)
  Identified 317 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   307  (96.85%)
          2 :     7  (2.21%)
          3 :     2  (0.63%)
          6 :     1  (0.32%)
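
The occurrence distribution above (how many distinct weight vectors appear once, twice, and so on) can be computed with two `Counter` passes. A minimal sketch over hypothetical toy data:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then count
    how many distinct vectors share each occurrence count."""
    # Convert to tuples so the vectors are hashable.
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

# Hypothetical toy data: three distinct vectors, one repeated twice.
vectors = [(0.1, 0.2), (0.1, 0.2), (0.5, 0.5), (0.9, 1.0)]
print(sorted(occurrence_distribution(vectors).items()))
# → [(1, 2), (2, 1)]  i.e. two vectors occur once, one occurs twice
```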

Identified 0 non-pure unique weight vectors (from 317 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 145
     0.000 : 172

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 333
  Number of unique weight vectors: 317

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (317, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 317 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest-first selection of 74 weight vectors from 317 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 42 matches and 32 non-matches
    Purity of oracle classification:  0.568
    Entropy of oracle classification: 0.987
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  32
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 243 weight vectors
  Based on 42 matches and 32 non-matches
  Classified 241 matches and 2 non-matches
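
The split step trains a binary classifier on the oracle-labelled vectors and partitions the remaining cluster members by predicted class. A minimal sketch using scikit-learn's `SVC` (the script's actual SVM settings are not visible in this log, and the 1/0 match/non-match label encoding is an assumption):

```python
# Split a cluster with an SVM: train on oracle-labelled vectors,
# then partition the unlabelled vectors by predicted class.
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, cluster_vectors):
    clf = SVC()  # default RBF kernel; assumed settings
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(cluster_vectors)
    match_cluster = [v for v, p in zip(cluster_vectors, predictions) if p == 1]
    non_match_cluster = [v for v, p in zip(cluster_vectors, predictions) if p == 0]
    return match_cluster, non_match_cluster
```

Each sufficiently large sub-cluster is then pushed back onto the recursion queue; a sub-cluster smaller than the required sample size is not queued.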

  Non-match cluster not large enough for required sample size
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 1
  Number of manual oracle classifications performed: 74
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (241, 0.5675675675675675, 0.9867867202680318, 0.5675675675675675)

Current size of match and non-match training data sets: 42 / 32

Selected cluster (queue ordering: random) with:
- Purity 0.57 and entropy 0.99
- Size 241 weight vectors
- Estimated match proportion 0.568

Sample size for this cluster: 68

Farthest-first selection of 68 weight vectors from 241 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.817, 1.000, 0.250, 0.212, 0.256, 0.045, 0.250] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.867, 1.000, 0.208, 0.167, 0.194, 0.341, 0.151] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.715, 1.000, 0.214, 0.125, 0.270, 0.214, 0.167] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.224, 0.219, 0.140, 0.209, 0.161] (False)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 36 matches and 32 non-matches
    Purity of oracle classification:  0.529
    Entropy of oracle classification: 0.998
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  32
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(20)767_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 767), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)767_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 667
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 667 weight vectors
  Containing 217 true matches and 450 true non-matches
    (32.53% true matches)
  Identified 630 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   612  (97.14%)
          2 :    15  (2.38%)
          3 :     2  (0.32%)
         19 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 630 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 447

Removed 1 non-pure weight vector

Final number of weight vectors to use: 666
  Number of unique weight vectors: 630
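
A unique weight vector is non-pure when its duplicate copies carry both match and non-match labels; here the pureness-0.947 vector has 18 match and 1 non-match copy among its 19 occurrences, and dropping the minority copy reduces 667 input vectors to 666. A sketch of that removal (label 1 = match; resolving a tie towards match is an assumption):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    """Drop minority-class copies of non-pure unique weight vectors:
    a vector whose duplicates carry both labels keeps only its
    majority-label copies (label 1 = match, 0 = non-match)."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)
    kept = []
    for vec, lab in zip(weight_vectors, labels):
        labs = groups[tuple(vec)]
        # Majority label of this vector's duplicate group (tie -> match).
        majority = 1 if 2 * sum(labs) >= len(labs) else 0
        if lab == majority:
            kept.append((vec, lab))
    return kept

# Toy example: (0.9,) occurs three times with labels [1, 1, 0], so
# its single non-match copy is removed.
kept = remove_minority_copies([(0.9,), (0.9,), (0.9,), (0.2,)], [1, 1, 0, 0])
print(len(kept))  # → 3
```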

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (630, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 630 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest-first selection of 83 weight vectors from 630 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 547 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 135 matches and 412 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (412, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 412 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 70

Farthest-first selection of 70 weight vectors from 412 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 12 matches and 58 non-matches
    Purity of oracle classification:  0.829
    Entropy of oracle classification: 0.661
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(15)122_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (15, 1 - acm diverg, 122), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)122_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 699
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 699 weight vectors
  Containing 169 true matches and 530 true non-matches
    (24.18% true matches)
  Identified 680 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   667  (98.09%)
          2 :    10  (1.47%)
          3 :     2  (0.29%)
          6 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 680 unique weight vectors)
Pureness (as percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 152
     0.000 : 528

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 699
  Number of unique weight vectors: 680

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (680, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 680 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 680 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
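
The "far" selection above is a greedy farthest-first traversal; a minimal sketch (the original program's choice of starting vector and tie-breaking may differ):

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Farthest-first traversal: start from a random vector, then repeatedly
    select the vector whose Euclidean distance to its nearest already
    selected vector is largest."""
    rng = random.Random(seed)
    selected = [rng.randrange(len(vectors))]
    # min_dist[i]: distance of vector i to the nearest selected vector
    min_dist = [math.dist(v, vectors[selected[0]]) for v in vectors]
    while len(selected) < k:
        next_idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(next_idx)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[next_idx]))
    return selected

# Toy example: the outlier (10, 10) is always among the first two picks
picks = farthest_first([(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (10.0, 10.0)], 2)
```

This spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches and clear non-matches.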

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
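
The purity and entropy figures reported for each cluster follow the usual definitions: purity is the majority-class fraction and entropy the base-2 Shannon entropy of the match/non-match proportions. A sketch consistent with the logged numbers (not necessarily the original code):

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity, entropy, and estimated match proportion of a cluster,
    as printed for each queue entry in the log."""
    total = num_match + num_non_match
    p = num_match / total                        # estimated match proportion
    purity = max(num_match, num_non_match) / total
    entropy = -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)
    return purity, entropy, p

# The oracle result above: 29 matches and 55 non-matches
purity, entropy, p = cluster_stats(29, 55)
```

This reproduces the Loop 2 queue entries below: purity 0.6548, entropy 0.9297, estimated match proportion 0.3452.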

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 596 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 107 matches and 489 non-matches
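
The SVM split step can be sketched as below, assuming a scikit-learn style API (the original script may use a different SVM implementation). The classifier is trained on the oracle-labelled vectors and partitions the remaining cluster into two child clusters:

```python
import numpy as np
from sklearn.svm import SVC  # assumption: scikit-learn is available

def svm_split(match_train, non_match_train, remaining):
    """Train a binary SVM on the oracle-classified training sets, then
    split the remaining weight vectors into a predicted-match and a
    predicted-non-match cluster."""
    X = np.vstack([match_train, non_match_train])
    y = np.array([1] * len(match_train) + [0] * len(non_match_train))
    clf = SVC(kernel="linear").fit(X, y)
    pred = clf.predict(np.asarray(remaining))
    match_cluster = [v for v, lab in zip(remaining, pred) if lab == 1]
    non_match_cluster = [v for v, lab in zip(remaining, pred) if lab == 0]
    return match_cluster, non_match_cluster

# Toy example with clearly separable weight vectors
m, n = svm_split([[0.9, 1.0], [1.0, 0.9]],
                 [[0.1, 0.0], [0.0, 0.2]],
                 [[0.95, 0.95], [0.05, 0.05]])
```

Both child clusters inherit the parent's purity/entropy estimate until they are sampled themselves, which is why the two Loop 2 queue entries below share identical statistics.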

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (107, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (489, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 107 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 107 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 45 matches and 3 non-matches
    Purity of oracle classification:  0.938
    Entropy of oracle classification: 0.337
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing file: diverg(20)998_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 998), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)998_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)450_NEW.csv
<class 'pandas.core.series.Series'>
Current row here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.980583
recall                 0.337793
f-measure              0.502488
da                          103
dm                            0
ndm                           0
tp                          101
fp                            2
tn                  4.76529e+07
fn                          198
Name: (10, 1 - acm diverg, 450), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)450_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 797
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 797 weight vectors
  Containing 150 true matches and 647 true non-matches
    (18.82% true matches)
  Identified 763 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   733  (96.07%)
          2 :    27  (3.54%)
          3 :     2  (0.26%)
          4 :     1  (0.13%)

Identified 0 non-pure unique weight vectors (from 763 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 136
     0.000 : 627

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 797
  Number of unique weight vectors: 763

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (763, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 763 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 763 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
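
The purity and entropy values reported for each oracle classification follow the standard two-class definitions: purity is the fraction of the majority class, and entropy is the Shannon entropy (in bits) of the match/non-match split. A minimal sketch (the helper name is illustrative, not taken from the script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = two-class
    Shannon entropy of the match/non-match split, in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The 29 matches / 56 non-matches classified above:
print(purity_entropy(29, 56))  # purity ~ 0.659, entropy ~ 0.926
```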

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 678 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 129 matches and 549 non-matches
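
The split step above trains a binary classifier on the oracle-labelled sample and partitions the remaining vectors of the cluster into a predicted-match and a predicted-non-match cluster. The script's own SVM wrapper is not shown in this log; the sketch below uses scikit-learn's `SVC` as a stand-in:

```python
from sklearn.svm import SVC

def svm_split(train_match, train_non_match, remaining):
    """Train an SVM on labelled match (1) / non-match (0) weight vectors
    and split the remaining vectors into two predicted clusters."""
    X = train_match + train_non_match
    y = [1] * len(train_match) + [0] * len(train_non_match)
    clf = SVC(kernel="linear")
    clf.fit(X, y)
    pred = clf.predict(remaining)
    match_cluster = [v for v, p in zip(remaining, pred) if p == 1]
    non_match_cluster = [v for v, p in zip(remaining, pred) if p == 0]
    return match_cluster, non_match_cluster
```

The two returned clusters are then pushed back onto the queue, as seen in the "Queue length: 2" loop header that follows.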

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (129, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (549, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 549 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 549 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.783, 0.583, 0.435, 0.765, 0.429] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.474, 0.692, 0.826, 0.484, 0.545] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.571, 0.556, 0.235, 0.429] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

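The farthest-first selection above picks a start vector and then repeatedly adds the vector whose distance to its nearest already-selected vector is largest (the greedy k-centre heuristic). A minimal sketch with Euclidean distance; the script's actual distance function and starting choice are not shown in this log:

```python
import math

def farthest_first(vectors, k, start=0):
    """Select k vectors: each new pick maximises the distance to its
    closest already-selected vector (greedy farthest-first traversal)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    # distance from each candidate to its nearest selected vector
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        far = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(far)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[far]))
    return [vectors[i] for i in selected]
```

This spreads the sample across the cluster's weight-vector space, which is why the selected vectors above mix very similar and very dissimilar record pairs.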
Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 0 matches and 74 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

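Each run above follows the same recursive loop: pop a cluster from the queue, sample it, have the oracle label the sample, then either keep the cluster or split it and push the halves back, until the manual-classification budget is exhausted. A self-contained skeleton of that loop; the oracle here is simply the true labels, and the split thresholds on mean similarity as a stand-in for the script's SVM:

```python
import random

def recursive_selection(vectors, labels, budget, min_purity=0.95,
                        max_cluster_size=50, sample_frac=0.25):
    """Skeleton of the recursive training-example selection loop.
    All parameter values here are illustrative, not from the script."""
    train_m, train_n = [], []
    queue = [list(range(len(vectors)))]  # start with one big cluster
    used = 0  # manual classifications performed so far
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random queue order
        k = min(max(1, int(len(cluster) * sample_frac)), len(cluster))
        sample = random.sample(cluster, k)
        used += k
        for i in sample:  # oracle classification of the sample
            (train_m if labels[i] else train_n).append(vectors[i])
        sample_set = set(sample)
        rest = [i for i in cluster if i not in sample_set]
        n_match = sum(1 for i in sample if labels[i])
        purity = max(n_match, k - n_match) / k
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            hi, lo = [], []  # stand-in split on mean similarity
            for i in rest:
                (hi if sum(vectors[i]) / len(vectors[i]) >= 0.5 else lo).append(i)
            queue.extend(c for c in (hi, lo) if c)
    return train_m, train_n
```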
103.0
Analysing file: diverg(10)697_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (10, 1 - acm diverg, 697), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)697_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 351
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 351 weight vectors
  Containing 172 true matches and 179 true non-matches
    (49.00% true matches)
  Identified 330 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   318  (96.36%)
          2 :     9  (2.73%)
          3 :     2  (0.61%)
          9 :     1  (0.30%)

Identified 1 non-pure unique weight vectors (from 330 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 153
     0.889 :  1   (all weight vectors with this pureness will be removed)
     0.000 : 176

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 342
  Number of unique weight vectors: 329

Time to load and analyse the weight vector file: 0.00 sec
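
The non-pure filtering above groups identical weight vectors by value and removes every group whose true-match labels disagree (pureness strictly between 0.0 and 1.0), since such vectors cannot be labelled consistently. A minimal sketch of that grouping (the function name is illustrative):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels):
    """Group identical weight vectors; drop every group whose true-match
    labels are mixed (pureness neither 0.0 nor 1.0)."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, labels):
        groups[tuple(vec)].append(is_match)
    kept_vecs, kept_labels, removed = [], [], 0
    for vec, is_match in zip(weight_vectors, labels):
        group = groups[tuple(vec)]
        pureness = sum(group) / len(group)  # fraction of true matches
        if pureness in (0.0, 1.0):
            kept_vecs.append(vec)
            kept_labels.append(is_match)
        else:
            removed += 1
    return kept_vecs, kept_labels, removed
```

In the run above, one unique vector with pureness 0.889 occurred nine times, so nine weight vectors were removed (351 - 9 = 342).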

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (329, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 329 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

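The per-cluster sample size grows with cluster size and shrinks as the estimated match proportion moves away from 0.5. One formula that approximately reproduces the sizes in this log is Cochran's sample size with finite population correction; this is an assumption, as the script's exact formula, z-value, and error bound are not shown here:

```python
def sample_size(cluster_size, est_match_prop, z=1.96, error=0.1):
    """Cochran's sample size with finite population correction.
    Illustrative only: z and error are assumed, not read from the log."""
    p = est_match_prop
    n0 = (z ** 2) * p * (1.0 - p) / (error ** 2)  # infinite-population size
    n = n0 / (1.0 + (n0 - 1.0) / cluster_size)    # finite correction
    return int(round(n))

# e.g. a cluster of 908 vectors with estimated match proportion 0.5
print(sample_size(908, 0.5))  # 87, as in the run further below
```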
Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 329 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 31 matches and 43 non-matches
    Purity of oracle classification:  0.581
    Entropy of oracle classification: 0.981
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 255 weight vectors
  Based on 31 matches and 43 non-matches
  Classified 125 matches and 130 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 74
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (125, 0.581081081081081, 0.9809470132751208, 0.4189189189189189)
    (130, 0.581081081081081, 0.9809470132751208, 0.4189189189189189)

Current size of match and non-match training data sets: 31 / 43

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 125 weight vectors
- Estimated match proportion 0.419

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 125 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 48 matches and 6 non-matches
    Purity of oracle classification:  0.889
    Entropy of oracle classification: 0.503
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing file: diverg(10)11_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.977273
recall                 0.431438
f-measure              0.598608
da                          132
dm                            0
ndm                           0
tp                          129
fp                            3
tn                  4.76529e+07
fn                          170
Name: (10, 1 - acm diverg, 11), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)11_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 942
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 942 weight vectors
  Containing 135 true matches and 807 true non-matches
    (14.33% true matches)
  Identified 908 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   879  (96.81%)
          2 :    26  (2.86%)
          3 :     2  (0.22%)
          5 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 908 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 121
     0.000 : 787

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 942
  Number of unique weight vectors: 908

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (908, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 908 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 908 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 821 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 86 matches and 735 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (86, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (735, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 735 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 735 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.462, 0.889, 0.455, 0.211, 0.375] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.333, 0.667, 0.400, 0.583, 0.563] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 4 matches and 70 non-matches
    Purity of oracle classification:  0.946
    Entropy of oracle classification: 0.303
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0
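
The purity and entropy figures reported after each oracle step follow the usual binary-cluster definitions: purity is the majority-class fraction, and entropy is the base-2 Shannon entropy of the match/non-match split. A minimal sketch (the helper name is mine, not from the script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity (majority-class fraction) and base-2 Shannon entropy
    of a binary match / non-match split, as reported in the log."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# e.g. the 4 matches / 70 non-matches classified above:
purity, entropy = purity_entropy(4, 70)
print(round(purity, 3), round(entropy, 3))  # → 0.946 0.303
```

The same function reproduces the (0.651, 0.934) pair reported for the 29 match / 54 non-match oracle step elsewhere in this run.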

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

132.0
Analyzing file: diverg(15)204_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 204), dtype: object
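
The precision/recall/f-measure triple in each per-file summary is consistent with the standard definitions over the tp/fp/fn counts. A small sketch of that arithmetic (the helper name is mine):

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from raw counts, matching
    the per-file summary above (e.g. tp=58, fp=0, fn=241)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

p, r, f = prf(58, 0, 241)
print(round(p, 5), round(r, 5), round(f, 5))  # → 1.0 0.19398 0.32493
```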

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)204_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 637
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 637 weight vectors
  Containing 195 true matches and 442 true non-matches
    (30.61% true matches)
  Identified 610 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   594  (97.38%)
          2 :    13  (2.13%)
          3 :     2  (0.33%)
         11 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 610 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 170
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 439

Removed 1 non-pure weight vector

Final number of weight vectors to use: 636
  Number of unique weight vectors: 610

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (610, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 610 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 610 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
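
The "far" initial selection above can be reproduced with a standard farthest-first traversal: seed with one vector, then repeatedly pick the vector whose minimum distance to the already-selected set is largest. A dependency-free sketch, assuming a Euclidean metric and a first-vector seed (both details may differ from the actual script):

```python
import math

def farthest_first(vectors, k):
    """Farthest-first traversal: each pick maximises the minimum
    Euclidean distance to the already-selected set."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed choice is an assumption
    # minimum distance of every vector to the selected set so far
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
print(farthest_first(pts, 3))  # → [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```

Because each pick maximises the minimum distance to previous picks, the selected sample spreads across the weight-vector space, which is why the lists above mix very high- and very low-similarity vectors.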

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 29 matches and 54 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.934
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 527 weight vectors
  Based on 29 matches and 54 non-matches
  Classified 144 matches and 383 non-matches
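
This split step trains on the oracle-labelled vectors and classifies the remaining unlabelled ones. The run above uses an SVM for this; as a library-free illustration of the same train-then-classify idea, here is a nearest-centroid stand-in (explicitly not the script's actual classifier):

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def split_by_centroid(match_train, non_match_train, unlabelled):
    """Train on oracle-labelled vectors, classify the rest: here a
    nearest-centroid rule stands in for the SVM used in the run."""
    cm, cn = centroid(match_train), centroid(non_match_train)
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    matches = [v for v in unlabelled if sqdist(v, cm) < sqdist(v, cn)]
    non_matches = [v for v in unlabelled if sqdist(v, cm) >= sqdist(v, cn)]
    return matches, non_matches

m, n = split_by_centroid([(0.9, 0.9)], [(0.1, 0.1)],
                         [(0.8, 0.7), (0.2, 0.3), (0.4, 0.2)])
print(len(m), len(n))  # → 1 2
```

The two resulting subsets are then pushed back onto the cluster queue, which is why the queue length grows to 2 in the next loop.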

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)
    (383, 0.6506024096385542, 0.9335289015212996, 0.3493975903614458)

Current size of match and non-match training data sets: 29 / 54

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 383 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 383 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.870, 0.619, 0.643, 0.700, 0.524] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 4 matches and 67 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.313
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing file: diverg(20)563_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 563), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)563_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)308_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985915
recall                 0.234114
f-measure              0.378378
da                           71
dm                            0
ndm                           0
tp                           70
fp                            1
tn                  4.76529e+07
fn                          229
Name: (10, 1 - acm diverg, 308), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)308_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 434
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 434 weight vectors
  Containing 138 true matches and 296 true non-matches
    (31.80% true matches)
  Identified 417 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   406  (97.36%)
          2 :     8  (1.92%)
          3 :     2  (0.48%)
          6 :     1  (0.24%)

Identified 0 non-pure unique weight vectors (from 417 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 121
     0.000 : 296

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 434
  Number of unique weight vectors: 417

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (417, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 417 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 417 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.786, 0.833, 0.545, 0.478, 0.346] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.364, 0.619, 0.471, 0.600, 0.533] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 18 matches and 60 non-matches
    Purity of oracle classification:  0.769
    Entropy of oracle classification: 0.779
    Number of true matches:      18
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
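
The purity and entropy values reported after each oracle classification are standard binary cluster statistics: purity is the fraction of vectors in the majority class, and entropy is the binary (base-2) entropy of the match proportion. A minimal sketch (not the original script's code) that reproduces the figures above for 18 matches and 60 non-matches:

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity and binary entropy of a cluster, given its class counts."""
    total = num_match + num_non_match
    p = num_match / total                 # match proportion
    purity = max(p, 1.0 - p)             # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)  # binary (base-2) entropy
    return purity, entropy, p

# Reproduce the values reported above: 18 matches / 60 non-matches
purity, entropy, p = cluster_stats(18, 60)
print(round(purity, 3), round(entropy, 3))   # matches the log: 0.769, 0.779
```

Note that the queue entries printed in each loop, e.g. `(107, 0.769..., 0.779..., 0.230...)`, carry exactly these statistics plus the estimated match proportion `p`.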

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 339 weight vectors
  Based on 18 matches and 60 non-matches
  Classified 107 matches and 232 non-matches
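
The SVM splitting step above can be sketched as follows: the oracle-labelled sample serves as training data, and the remaining weight vectors in the cluster are split into a predicted-match and a predicted-non-match child cluster. This sketch uses scikit-learn's `SVC` with an assumed linear kernel; the original implementation is not shown in this log and may differ:

```python
from sklearn.svm import SVC

def split_cluster(train_vectors, train_labels, remaining_vectors):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining weight vectors into predicted matches and non-matches."""
    clf = SVC(kernel='linear')   # assumed kernel; the original may differ
    clf.fit(train_vectors, train_labels)
    pred = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining_vectors, pred) if p == 0]
    return matches, non_matches
```

Both child clusters are then re-inserted into the queue, which is why the queue length grows to 2 in the next loop.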

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (107, 0.7692307692307693, 0.7793498372920852, 0.23076923076923078)
    (232, 0.7692307692307693, 0.7793498372920852, 0.23076923076923078)

Current size of match and non-match training data sets: 18 / 60

Selected cluster with (queue ordering: random):
- Purity 0.77 and entropy 0.78
- Size 107 weight vectors
- Estimated match proportion 0.231

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 107 vectors
  The selected farthest weight vectors are:
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)

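The "farthest first" selection above can be sketched as a greedy max-min traversal: starting from a seed vector, repeatedly add the vector whose minimum distance to the already-selected set is largest. The seed choice and the Euclidean metric below are assumptions for illustration; the original script may use a different seed or distance:

```python
import math

def farthest_first(vectors, k, seed_index=0):
    """Greedy farthest-first traversal: pick k vectors by repeatedly
    adding the one farthest from the already-selected set."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[seed_index]]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[idx]))
    return selected
```

This favours diverse, spread-out samples, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.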
Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 39 matches and 3 non-matches
    Purity of oracle classification:  0.929
    Entropy of oracle classification: 0.371
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

71.0
Analysing the file: diverg(10)224_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.99
recall                 0.331104
f-measure              0.496241
da                          100
dm                            0
ndm                           0
tp                           99
fp                            1
tn                  4.76529e+07
fn                          200
Name: (10, 1 - acm diverg, 224), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)224_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1005
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1005 weight vectors
  Containing 166 true matches and 839 true non-matches
    (16.52% true matches)
  Identified 966 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   937  (97.00%)
          2 :    26  (2.69%)
          3 :     2  (0.21%)
         10 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 966 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 147
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 818

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1004
  Number of unique weight vectors: 966

Time to load and analyse the weight vector file: 0.01 sec
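
The unique-vector and pureness analysis reported above can be sketched as an illustrative reimplementation (not the authors' code): count how often each distinct weight vector occurs and what fraction of its occurrences correspond to true matches. Vectors whose pureness lies strictly between 0 and 1 are the "non-pure" ones flagged for removal:

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(weight_vectors):
    """Count unique weight vectors, their occurrence frequencies, and
    the pureness (fraction of true matches) of each unique vector."""
    occ = Counter()
    match_count = defaultdict(int)
    for vec, is_match in weight_vectors:
        key = tuple(vec)
        occ[key] += 1
        if is_match:
            match_count[key] += 1
    # occurrence count -> number of unique vectors occurring that often
    freq_dist = Counter(occ.values())
    pureness = {key: match_count[key] / n for key, n in occ.items()}
    return occ, freq_dist, pureness
```

A vector like the one with pureness 0.900 above occurred ten times, nine of them as a true match, so it can be neither kept as purely matching nor purely non-matching.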

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (966, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 966 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 966 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 879 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 83 matches and 796 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (83, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (796, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 796 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 796 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 12 matches and 58 non-matches
    Purity of oracle classification:  0.829
    Entropy of oracle classification: 0.661
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing the file: diverg(20)485_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 485), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)485_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 855
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 855 weight vectors
  Containing 221 true matches and 634 true non-matches
    (25.85% true matches)
  Identified 799 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   763  (95.49%)
          2 :    33  (4.13%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 799 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 613

Removed 1 non-pure weight vector

Final number of weight vectors to use: 854
  Number of unique weight vectors: 799

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (799, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 799 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 799 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 714 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 150 matches and 564 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (564, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 150 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
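The farthest-first selections logged above can be sketched as a greedy traversal: seed with one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal stand-alone sketch (the function name `farthest_first` and the Euclidean metric are assumptions; the actual script may differ in seeding and distance choice):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric tuples.

    Seeds with the first vector, then repeatedly appends the vector whose
    minimum distance to the selected set is largest, until k are chosen.
    """
    selected = [vectors[0]]
    while len(selected) < k and len(selected) < len(vectors):
        best, best_dist = None, -1.0
        for v in vectors:
            if v in selected:
                continue
            # Distance to the selected set = distance to its nearest member
            d = min(math.dist(v, s) for s in selected)
            if d > best_dist:
                best, best_dist = v, d
        selected.append(best)
    return selected
```

Because each new pick maximises its distance to everything chosen so far, the sample spreads across the cluster, which is why the oracle sees a diverse mix of matches and non-matches rather than near-duplicates.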

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0
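The purity and entropy reported for each oracle-labelled sample follow the usual two-class definitions: purity is the fraction of the majority class, and entropy is the binary Shannon entropy of the match proportion. A sketch reproducing the figures above for 49 matches and 5 non-matches (function name `purity_entropy` is an assumption, not taken from the script):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Two-class purity (majority-class fraction) and Shannon entropy in bits."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                      # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the sample above, `purity_entropy(49, 5)` gives approximately (0.907, 0.445), matching the logged values; a perfectly pure cluster would give (1.0, 0.0).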

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(20)376_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (20, 1 - acm diverg, 376), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)376_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1070
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1070 weight vectors
  Containing 214 true matches and 856 true non-matches
    (20.00% true matches)
  Identified 1016 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   981  (96.56%)
          2 :    32  (3.15%)
          3 :     2  (0.20%)
         19 :     1  (0.10%)
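Occurrence histograms like the one above can be produced with two nested `collections.Counter` passes: one counting how often each unique vector appears, and one counting how many vectors share each occurrence count. A sketch (the helper name `occurrence_histogram` is an assumption):

```python
from collections import Counter

def occurrence_histogram(weight_vectors):
    """Map each occurrence count to the number of unique vectors with that count."""
    occ = Counter(map(tuple, weight_vectors))   # unique vector -> how often it occurs
    return dict(Counter(occ.values()))          # occurrence count -> number of vectors
```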

Identified 1 non-pure unique weight vector (from 1016 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 835

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1069
  Number of unique weight vectors: 1016

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1016, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1016 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1016 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 929 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 165 matches and 764 non-matches
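After the oracle labels a sample, the rest of the cluster is split by a classifier trained on those labels; the log shows an SVM doing this. As a dependency-free illustration of the splitting step, the sketch below substitutes a nearest-centroid rule for the SVM (the actual script presumably uses a real SVM implementation, e.g. from scikit-learn; all names here are assumptions):

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(unlabelled, match_train, non_match_train):
    """Split a cluster into predicted matches / non-matches by assigning each
    vector to the nearer training-set centroid (a stand-in for the SVM split)."""
    cm, cn = centroid(match_train), centroid(non_match_train)
    matches, non_matches = [], []
    for v in unlabelled:
        if math.dist(v, cm) <= math.dist(v, cn):
            matches.append(v)
        else:
            non_matches.append(v)
    return matches, non_matches
```

Both predicted sub-clusters are then pushed back onto the queue, each inheriting the purity/entropy estimate of the oracle sample, as the Loop 2 queue listing shows.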

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (165, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (764, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 764 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 78

Farthest first selection of 78 weight vectors from 764 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 1 match and 77 non-matches
    Purity of oracle classification:  0.987
    Entropy of oracle classification: 0.099
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(10)255_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (10, 1 - acm diverg, 255), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)255_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 429
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 429 weight vectors
  Containing 219 true matches and 210 true non-matches
    (51.05% true matches)
  Identified 393 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   377  (95.93%)
          2 :    13  (3.31%)
          3 :     2  (0.51%)
         20 :     1  (0.25%)

Identified 1 non-pure unique weight vector (from 393 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 209

Removed 1 non-pure weight vector

Final number of weight vectors to use: 428
  Number of unique weight vectors: 393

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (393, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 393 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 393 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 40 matches and 37 non-matches
    Purity of oracle classification:  0.519
    Entropy of oracle classification: 0.999
    Number of true matches:      40
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 316 weight vectors
  Based on 40 matches and 37 non-matches
  Classified 147 matches and 169 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.5194805194805194, 0.9989047442823606, 0.5194805194805194)
    (169, 0.5194805194805194, 0.9989047442823606, 0.5194805194805194)

Current size of match and non-match training data sets: 40 / 37

Selected cluster (queue ordering: random):
- Purity 0.52 and entropy 1.00
- Size 169 weight vectors
- Estimated match proportion 0.519

Sample size for this cluster: 61

Farthest first selection of 61 weight vectors from 169 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.146, 0.130, 0.176, 0.318, 0.167] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.717, 1.000, 0.240, 0.231, 0.065, 0.192, 0.184] (False)
    [0.817, 1.000, 0.182, 0.115, 0.154, 0.194, 0.111] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.800, 1.000, 0.259, 0.229, 0.214, 0.258, 0.156] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.663, 1.000, 0.273, 0.244, 0.226, 0.196, 0.238] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.850, 1.000, 0.179, 0.205, 0.188, 0.061, 0.180] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.913, 1.000, 0.184, 0.175, 0.087, 0.233, 0.167] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.715, 1.000, 0.214, 0.125, 0.270, 0.214, 0.167] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.758, 1.000, 0.300, 0.140, 0.135, 0.125, 0.148] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.947, 1.000, 0.292, 0.178, 0.227, 0.122, 0.154] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 61 weight vectors
  The oracle will correctly classify 61 weight vectors and wrongly classify 0
  Classified 4 matches and 57 non-matches
    Purity of oracle classification:  0.934
    Entropy of oracle classification: 0.349
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 61 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)414_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 414), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)414_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
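The farthest-first selection used above greedily picks the weight vector whose minimum distance to the already-selected set is largest. A minimal sketch under Euclidean distance (the original may seed and break ties differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector, then
    repeatedly add the vector whose minimum Euclidean distance to the
    selected set is largest, until k vectors are chosen."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# The near-duplicate of the start point is skipped in favour of far corners
corners = farthest_first([(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)], 3)
```

This diversity-seeking selection is why the sampled vectors above mix clear matches and clear non-matches rather than clustering in one region.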

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
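The purity and entropy figures reported here follow the usual binary definitions: purity is the majority-class fraction, entropy the Shannon entropy of the match proportion. A minimal sketch consistent with the logged values (28 matches / 60 non-matches giving purity 0.682 and entropy 0.902):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity = majority-class fraction; entropy = binary Shannon entropy
    of the match proportion p (in bits)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:               # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_and_entropy(28, 60)
```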

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 156 matches and 800 non-matches
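The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by its predictions. A minimal sketch using scikit-learn's `SVC` (an assumption: the original script ships its own SVM setup, kernel, and parameters):

```python
from sklearn.svm import SVC

def svm_split(train_vectors, train_labels, cluster_vectors):
    """Train a linear SVM on the oracle-labelled sample, then split the
    remaining weight vectors into predicted matches and non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, predictions) if p]
    non_matches = [v for v, p in zip(cluster_vectors, predictions) if not p]
    return matches, non_matches

# 1-D toy example: boundary falls halfway between the two training points
pred_matches, pred_non_matches = svm_split(
    [[0.0], [1.0]], [False, True], [[0.1], [0.9]])
```

The two predicted sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.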

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (800, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 800 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 800 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.067, 0.550, 0.636, 0.500, 0.286] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.333, 0.545, 0.476, 0.727, 0.762] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 4 matches and 71 non-matches
    Purity of oracle classification:  0.947
    Entropy of oracle classification: 0.300
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)144_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (20, 1 - acm diverg, 144), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)144_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1087
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1087 weight vectors
  Containing 214 true matches and 873 true non-matches
    (19.69% true matches)
  Identified 1033 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   998  (96.61%)
          2 :    32  (3.10%)
          3 :     2  (0.19%)
         19 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1033 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1086
  Number of unique weight vectors: 1033

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1033, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1033 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1033 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 945 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 98 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (98, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 98 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 98 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 42 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing the file: diverg(10)553_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.98
recall                 0.327759
f-measure              0.491228
da                          100
dm                            0
ndm                           0
tp                           98
fp                            2
tn                  4.76529e+07
fn                          201
Name: (10, 1 - acm diverg, 553), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)553_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 271
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 271 weight vectors
  Containing 152 true matches and 119 true non-matches
    (56.09% true matches)
  Identified 255 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   243  (95.29%)
          2 :     9  (3.53%)
          3 :     2  (0.78%)
          4 :     1  (0.39%)

Identified 0 non-pure unique weight vectors (from 255 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 138
     0.000 : 117

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 271
  Number of unique weight vectors: 255

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (255, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 255 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 70

Perform initial selection using "far" method

Farthest first selection of 70 weight vectors from 255 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 31 matches and 39 non-matches
    Purity of oracle classification:  0.557
    Entropy of oracle classification: 0.991
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  39
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 185 weight vectors
  Based on 31 matches and 39 non-matches
  Classified 113 matches and 72 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 70
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (113, 0.5571428571428572, 0.9905577004075261, 0.44285714285714284)
    (72, 0.5571428571428572, 0.9905577004075261, 0.44285714285714284)

Current size of match and non-match training data sets: 31 / 39

Selected cluster (queue ordering: random) with:
- Purity 0.56 and entropy 0.99
- Size 113 weight vectors
- Estimated match proportion 0.443

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 113 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
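The "farthest first" selections shown throughout this log can be sketched as a standard farthest-first traversal over the weight vectors. This is a sketch only: the seeding strategy and distance metric of the actual program are assumptions (first-vector seeding and Euclidean distance are used here).

```python
import math

def farthest_first(vectors, k):
    # Farthest-first traversal: start from the first vector, then
    # repeatedly add the vector whose minimum Euclidean distance to
    # the already-selected set is largest.
    selected = [vectors[0]]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected
```

This greedy traversal spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.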

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 46 matches and 6 non-matches
    Purity of oracle classification:  0.885
    Entropy of oracle classification: 0.516
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(10)502_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 502), dtype: object
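The precision, recall and f-measure values in the pandas row above follow directly from the tp/fp/fn counts; a quick check (hypothetical helper, not part of the program):

```python
def prf(tp, fp, fn):
    # Standard precision / recall / F1 from confusion-matrix counts
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Row for diverg(10)502_NEW.csv: tp=52, fp=1, fn=247
p, r, f = prf(52, 1, 247)
print(round(p, 6), round(r, 6), round(f, 6))  # 0.981132 0.173913 0.295455
```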

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)502_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 390
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 390 weight vectors
  Containing 209 true matches and 181 true non-matches
    (53.59% true matches)
  Identified 359 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   345  (96.10%)
          2 :    11  (3.06%)
          3 :     2  (0.56%)
         17 :     1  (0.28%)
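The occurrence distribution above (how many distinct weight vectors appear once, twice, and so on) is straightforward to compute with a nested counter; a minimal sketch:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First count how often each distinct weight vector occurs,
    # then tabulate: occurrence count -> number of distinct vectors
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())

# Example: one vector occurs twice, one occurs once
dist = occurrence_distribution([[1.0, 0.5], [1.0, 0.5], [0.2, 0.3]])
```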

Identified 1 non-pure unique weight vector (from 359 unique weight vectors)
Pureness (as a percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 178
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 180

Removed 1 non-pure weight vector

Final number of weight vectors to use: 389
  Number of unique weight vectors: 359

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (359, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 359 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 359 vectors
  The selected farthest weight vectors are:
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 43 matches and 33 non-matches
    Purity of oracle classification:  0.566
    Entropy of oracle classification: 0.987
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  33
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 283 weight vectors
  Based on 43 matches and 33 non-matches
  Classified 140 matches and 143 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.5657894736842105, 0.9874750082985964, 0.5657894736842105)
    (143, 0.5657894736842105, 0.9874750082985964, 0.5657894736842105)

Current size of match and non-match training data sets: 43 / 33

Selected cluster (queue ordering: random) with:
- Purity 0.57 and entropy 0.99
- Size 140 weight vectors
- Estimated match proportion 0.566

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 140 vectors
  The selected farthest weight vectors are:
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 48 matches and 9 non-matches
    Purity of oracle classification:  0.842
    Entropy of oracle classification: 0.629
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)817_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 817), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)817_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 220 true matches and 856 true non-matches
    (20.45% true matches)
  Identified 1020 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   984  (96.47%)
          2 :    33  (3.24%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1020 unique weight vectors)
Pureness (as a percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 835

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1020

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1020, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1020 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1020 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 933 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 168 matches and 765 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (168, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (765, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 168 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 168 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 44 matches and 13 non-matches
    Purity of oracle classification:  0.772
    Entropy of oracle classification: 0.775
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  13
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(15)814_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 814), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)814_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 738
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 738 weight vectors
  Containing 223 true matches and 515 true non-matches
    (30.22% true matches)
  Identified 702 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   686  (97.72%)
          2 :    13  (1.85%)
          3 :     2  (0.28%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 702 unique weight vectors)
Pureness (as a percentage of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 514

Removed 1 non-pure weight vector

Final number of weight vectors to use: 737
  Number of unique weight vectors: 702

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (702, 0.5, 1.0, 0.5)
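Each queue entry above is a `(size, purity, entropy, estimated match proportion)` tuple, and clusters are taken from the queue in random order until the manual classification budget is exhausted. A rough sketch of that outer loop, with the cluster representation, budget value and sampling rule all hypothetical:

```python
import random

# Hypothetical cluster queue: each entry holds the cluster's weight vectors
# plus the summary statistics shown in the log.
queue = [{"vectors": list(range(702)), "size": 702,
          "purity": 0.5, "entropy": 1.0, "est_match_prop": 0.5}]

budget, used = 150, 0
while queue and used < budget:
    # "random" queue ordering: pick any cluster from the queue
    cluster = queue.pop(random.randrange(len(queue)))
    sample_size = min(len(cluster["vectors"]), budget - used)
    used += sample_size
    # ... oracle-classify the sample, then (if the cluster is not pure
    # enough or too large) split the remainder into two child clusters
    # and append them to the queue ...

print(used)
```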

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 702 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 702 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
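Farthest-first traversal picks, at each step, the vector whose distance to its nearest already-selected vector is largest, spreading the sample across the weight space. A minimal sketch (the seeding of the first vector and the Euclidean distance metric are assumptions; the actual program may differ):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors: seed with the first vector, then repeatedly add
    the vector whose distance to its nearest selected vector is largest."""
    selected = [vectors[0]]
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
    return selected

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
print(farthest_first(pts, 3))  # the corner (1,1) is picked second
```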

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
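The purity and entropy figures above follow from the standard definitions: purity is the majority-class fraction of the oracle-classified sample, and entropy is the binary Shannon entropy (in bits) of the match/non-match split. For 27 matches and 57 non-matches:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = majority class fraction of the sample; entropy = binary
    Shannon entropy (in bits) of the match/non-match proportions."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # by convention 0 * log2(0) = 0
            entropy -= q * math.log2(q)
    return purity, entropy

purity, entropy = purity_entropy(27, 57)
print(round(purity, 3), round(entropy, 3))  # 0.679 0.906
```

A one-class sample (e.g. 0 matches, 75 non-matches, as in a later loop below) gives purity 1.000 and entropy 0.000.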

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 618 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 135 matches and 483 non-matches
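The split step trains a classifier on the oracle-labelled sample and partitions the remaining unlabelled vectors into a predicted-match and a predicted-non-match child cluster. A sketch using scikit-learn's `SVC` (the original program's SVM library and parameters are not shown, so this is an assumed stand-in, and the data here is synthetic):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Synthetic oracle-labelled training sample: 7-dimensional weight vectors,
# labelled 1 = match, 0 = non-match by a simple threshold on the mean weight.
X_train = rng.random((40, 7))
y_train = (X_train.mean(axis=1) > 0.5).astype(int)

# Train an SVM on the labelled sample, then split the unlabelled remainder
# into the two child clusters pushed back onto the queue.
clf = SVC(kernel="linear").fit(X_train, y_train)
X_rest = rng.random((200, 7))
pred = clf.predict(X_rest)

match_cluster = X_rest[pred == 1]
non_match_cluster = X_rest[pred == 0]
print(len(match_cluster), len(non_match_cluster))
```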

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (483, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 483 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 483 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.565, 0.667, 0.600, 0.412, 0.381] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.455, 0.714, 0.429, 0.550, 0.647] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 12 matches and 59 non-matches
    Purity of oracle classification:  0.831
    Entropy of oracle classification: 0.655
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)215_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976378
recall                 0.414716
f-measure               0.58216
da                          127
dm                            0
ndm                           0
tp                          124
fp                            3
tn                  4.76529e+07
fn                          175
Name: (10, 1 - acm diverg, 215), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)215_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 905
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 905 weight vectors
  Containing 139 true matches and 766 true non-matches
    (15.36% true matches)
  Identified 871 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   842  (96.67%)
          2 :    26  (2.99%)
          3 :     2  (0.23%)
          5 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 871 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 125
     0.000 : 746

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 905
  Number of unique weight vectors: 871

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (871, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 871 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 871 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 30 matches and 56 non-matches
    Purity of oracle classification:  0.651
    Entropy of oracle classification: 0.933
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 785 weight vectors
  Based on 30 matches and 56 non-matches
  Classified 236 matches and 549 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (236, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)
    (549, 0.6511627906976745, 0.9330252953592911, 0.3488372093023256)

Current size of match and non-match training data sets: 30 / 56

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 549 weight vectors
- Estimated match proportion 0.349

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 549 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.846, 0.857, 0.353, 0.318, 0.400] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.500, 0.875, 0.455, 0.333, 0.429] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [1.000, 0.000, 0.333, 0.667, 0.400, 0.583, 0.563] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 0 matches and 75 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  75
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

127.0
Analysing file: diverg(10)351_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 351), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)351_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 586
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 586 weight vectors
  Containing 186 true matches and 400 true non-matches
    (31.74% true matches)
  Identified 546 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   512  (93.77%)
          2 :    31  (5.68%)
          3 :     2  (0.37%)
          6 :     1  (0.18%)

Identified 0 non-pure unique weight vectors (from 546 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 166
     0.000 : 380

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 586
  Number of unique weight vectors: 546

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (546, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 546 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 546 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 32 matches and 49 non-matches
    Purity of oracle classification:  0.605
    Entropy of oracle classification: 0.968
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0
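The purity and entropy figures reported throughout this log follow the standard cluster-quality definitions: purity is the majority-class fraction of the sample, and entropy is the binary entropy of its match proportion. A minimal sketch reproducing the numbers above (the function name is illustrative, not taken from the program):

```python
import math

def cluster_quality(num_matches, num_non_matches):
    """Return (purity, entropy) for a sample with the given class counts."""
    total = num_matches + num_non_matches
    p = num_matches / total            # match proportion
    purity = max(p, 1.0 - p)           # fraction of the majority class
    entropy = 0.0                      # binary entropy; 0 for a pure sample
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# Reproduces the oracle summary above: 32 matches, 49 non-matches
purity, entropy = cluster_quality(32, 49)
print(round(purity, 3), round(entropy, 3))  # 0.605 0.968
```

The same (purity, entropy) pair is what the queue tuples in each `Loop` block carry alongside the cluster size.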

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 465 weight vectors
  Based on 32 matches and 49 non-matches
  Classified 156 matches and 309 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6049382716049383, 0.9679884922470297, 0.3950617283950617)
    (309, 0.6049382716049383, 0.9679884922470297, 0.3950617283950617)

Current size of match and non-match training data sets: 32 / 49

Selected cluster (queue ordering: random) with:
- Purity 0.60 and entropy 0.97
- Size 309 weight vectors
- Estimated match proportion 0.395

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 309 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.556, 0.348, 0.467, 0.636, 0.412] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.538, 0.600, 0.471, 0.632, 0.688] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.800, 0.667, 0.381, 0.550, 0.429] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.800, 0.000, 0.444, 0.545, 0.333, 0.111, 0.533] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.462, 0.667, 0.636, 0.368, 0.500] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
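The farthest-first selections in this log grow the sample greedily: each step adds the weight vector whose minimum distance to the already-selected vectors is largest, so the sample spreads across the cluster. A minimal sketch, assuming Euclidean distance and seeding from the first vector (the program's actual metric and seed choice are not shown in the log):

```python
def farthest_first(vectors, k):
    """Greedily pick k vectors, each maximising its minimum distance
    to the vectors already selected (Euclidean distance assumed)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [vectors[0]]  # seed choice is an assumption
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):  # refresh distances to selected set
            min_dist[j] = min(min_dist[j], dist(v, vectors[i]))
    return selected

# Picks the seed plus the point farthest from it
sample = farthest_first([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [1.0, 0.0]], 2)
```

Each update runs in O(n) per pick, so selecting k of n vectors costs O(nk) distance evaluations.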

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0
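The oracle step simulates a human annotator with configurable accuracy (the `oracle_acc` program parameter): each queried pair is labelled correctly with that probability and flipped otherwise, which is why at 100.00% accuracy the false match and false non-match counts are always zero, as above. A sketch with illustrative names:

```python
import random

def oracle_classify(true_labels, accuracy, rng=random.Random(42)):
    """Simulate a noisy human oracle: each true label is returned
    correctly with probability `accuracy`, otherwise flipped (sketch)."""
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]

# At accuracy 1.0 (100.00%), as in this run, no label is ever flipped
labels = [True, False, False, True]
assert oracle_classify(labels, 1.0) == labels
```

With accuracy below 1.0, the flipped labels would show up in the "false matches" and "false non-matches" counters of the oracle summary.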

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analyzing the file: diverg(15)986_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 986), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)986_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1071
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1071 weight vectors
  Containing 226 true matches and 845 true non-matches
    (21.10% true matches)
  Identified 1014 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   977  (96.35%)
          2 :    34  (3.35%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1014 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 824

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1070
  Number of unique weight vectors: 1014
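The pureness filter above groups duplicate weight vectors and, for each unique vector, measures the fraction of its occurrences that are true matches; minority-class copies of any non-pure vector are then removed (here the single 0.950-pure vector, presumably the one occurring 20 times, loses its one minority copy). A sketch with an illustrative function name:

```python
from collections import Counter

def pureness_filter(labelled_vectors):
    """Group duplicate vectors and drop minority-class copies of any
    unique vector whose occurrences carry mixed labels (sketch)."""
    groups = {}
    for vec, is_match in labelled_vectors:
        groups.setdefault(tuple(vec), []).append(is_match)
    kept, removed = [], 0
    for vec, labels in groups.items():
        majority, count = Counter(labels).most_common(1)[0]
        removed += len(labels) - count  # minority copies dropped
        kept.extend((list(vec), majority) for _ in range(count))
    return kept, removed

# A vector occurring 3 times with 2 match / 1 non-match labels:
# pureness 2/3, so the single minority copy is removed
kept, removed = pureness_filter([([0.5], True), ([0.5], True), ([0.5], False)])
```

This guarantees every surviving unique weight vector carries a single, unambiguous label before clustering starts.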

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1014, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1014 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87
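The log never prints the sample-size formula, but the reported sizes (87 of 1014, 71 of 309, 77 of 596, ...) agree to within one unit with Cochran's sample-size formula plus a finite-population correction at 95% confidence, using the cluster's estimated match proportion as p and a margin of error of 0.1 (presumably the `sample_error` parameter). A hedged sketch of that assumption:

```python
def sample_size(population, p, margin=0.1, z=1.96):
    """Cochran's formula with finite-population correction (assumed)."""
    n0 = (z ** 2) * p * (1.0 - p) / (margin ** 2)  # infinite-population size
    return n0 / (1.0 + (n0 - 1.0) / population)    # shrink for a finite cluster

# Close to the sizes logged in this run (rounding differs by at most 1):
# sample_size(1014, 0.5)  -> ~87.8 (logged: 87)
# sample_size(309, 0.395) -> ~71.0 (logged: 71)
```

The larger and the more balanced a cluster, the more oracle queries it draws, which is consistent with the sample sizes shrinking over the loops.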

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1014 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 927 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 331 matches and 596 non-matches
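The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by its predictions into a candidate-match and a candidate-non-match sub-cluster, both pushed back onto the queue. The program uses an SVM; to keep this sketch dependency-free it substitutes a nearest-centroid rule, which illustrates the same split mechanics:

```python
def split_cluster(match_sample, non_match_sample, remaining):
    """Split remaining vectors by nearest class centroid (a stand-in
    for the SVM classifier the program actually trains)."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cm = centroid(match_sample)
    cn = centroid(non_match_sample)
    matches = [v for v in remaining if sq_dist(v, cm) < sq_dist(v, cn)]
    non_matches = [v for v in remaining if sq_dist(v, cm) >= sq_dist(v, cn)]
    return matches, non_matches

# Toy split: one labelled match, one labelled non-match, two unlabelled
m, n = split_cluster([[0.9, 0.9]], [[0.1, 0.1]], [[0.8, 1.0], [0.2, 0.0]])
```

In the run above this step assigns 331 of the 927 remaining vectors to the match sub-cluster and 596 to the non-match sub-cluster.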

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (331, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (596, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 596 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 596 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.692, 0.583, 0.500, 0.750, 0.731] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing the file: diverg(20)422_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 422), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)422_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 226 true matches and 857 true non-matches
    (20.87% true matches)
  Identified 1026 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   989  (96.39%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1026 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1026

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1026, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1026 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1026 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 938 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 177 matches and 761 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (177, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (761, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 177 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 177 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
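
The "farthest first" selection shown in the listings above can be sketched as a greedy max-min traversal: each step adds the weight vector whose minimum Euclidean distance to the vectors already selected is largest. (The starting index and distance metric are assumptions about the original implementation.)

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedily select k vector indices, each maximising the minimum
    Euclidean distance to the vectors already selected."""
    selected = [start]
    # distance of every vector to its closest selected vector
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        min_dist = [min(d, math.dist(v, vectors[nxt]))
                    for d, v in zip(min_dist, vectors)]
    return selected
```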

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 44 matches and 14 non-matches
    Purity of oracle classification:  0.759
    Entropy of oracle classification: 0.797
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  14
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)793_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (15, 1 - acm diverg, 793), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)793_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 885
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 885 weight vectors
  Containing 177 true matches and 708 true non-matches
    (20.00% true matches)
  Identified 846 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   816  (96.45%)
          2 :    27  (3.19%)
          3 :     2  (0.24%)
          9 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 846 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 158
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 687

Removed 9 non-pure weight vectors
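
Non-pure weight vectors arise when the same vector is generated by both a true matching and a true non-matching record pair; its pureness is the fraction of its occurrences that are matches. A sketch of the removal step (helper name and signature are illustrative; as the two different log messages show, the script removes either all occurrences of a non-pure vector or only its minority-class occurrences):

```python
def pureness_filter(weight_vectors, labels, remove_all=False):
    """Drop non-pure weight vectors (0 < pureness < 1), either
    entirely or keeping only their majority-class occurrences."""
    # group the labels of identical weight vectors
    groups = {}
    for vec, lbl in zip(weight_vectors, labels):
        groups.setdefault(tuple(vec), []).append(lbl)
    kept_vecs, kept_lbls = [], []
    for vec, lbl in zip(weight_vectors, labels):
        lbls = groups[tuple(vec)]
        pureness = sum(lbls) / len(lbls)  # fraction of true matches
        if 0.0 < pureness < 1.0:          # occurs with both labels
            if remove_all:
                continue                  # drop every occurrence
            majority = 1 if pureness >= 0.5 else 0
            if lbl != majority:
                continue                  # drop minority occurrences
        kept_vecs.append(vec)
        kept_lbls.append(lbl)
    return kept_vecs, kept_lbls
```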

Final number of weight vectors to use: 876
  Number of unique weight vectors: 845

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (845, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 845 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 845 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 759 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 140 matches and 619 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (619, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 140 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 140 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 42 matches and 11 non-matches
    Purity of oracle classification:  0.792
    Entropy of oracle classification: 0.737
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analysing file: diverg(15)461_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 461), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)461_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 731
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 731 weight vectors
  Containing 217 true matches and 514 true non-matches
    (29.69% true matches)
  Identified 696 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   681  (97.84%)
          2 :    12  (1.72%)
          3 :     2  (0.29%)
         20 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 696 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 513

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 730
  Number of unique weight vectors: 696

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (696, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 696 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 696 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 612 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 129 matches and 483 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (129, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (483, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 483 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 483 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.565, 0.667, 0.600, 0.412, 0.381] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.692, 0.692, 0.727, 0.710, 0.250] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.455, 0.714, 0.429, 0.550, 0.647] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 12 matches and 59 non-matches
    Purity of oracle classification:  0.831
    Entropy of oracle classification: 0.655
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(20)54_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 54), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)54_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec
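The occurrence distribution above (how many distinct weight vectors appear once, twice, etc.) can be tabulated with two nested counts. A minimal sketch using `collections.Counter`; the function name and toy data are illustrative, not from the script:

```python
from collections import Counter

def occurrence_distribution(weight_vec_list):
    """For each distinct weight vector, count its occurrences, then
    tabulate how many distinct vectors occur each number of times."""
    vec_counts = Counter(tuple(w) for w in weight_vec_list)
    freq_dist = Counter(vec_counts.values())
    return dict(sorted(freq_dist.items()))

# Toy example (the real vectors have 7 weights each)
vecs = [(0.5, 1.0), (0.5, 1.0), (0.9, 0.0), (0.2, 0.3)]
print(occurrence_distribution(vecs))  # {1: 2, 2: 1}
```

In the run above this yields, for example, 750 vectors that occur once and one vector that occurs 20 times.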

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
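The "far" initial selection above is a farthest-first traversal: starting from an initial vector, repeatedly pick the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and the first vector as seed; the script's actual starting point, metric, and tie-breaking may differ:

```python
import math

def farthest_first_selection(vectors, k):
    """Select k vectors by farthest-first traversal: seed with the
    first vector, then repeatedly add the vector whose minimum
    Euclidean distance to the selected set is largest."""
    selected = [vectors[0]]
    # Minimum distance from each vector to the selected set so far
    min_dists = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        far_idx = max(range(len(vectors)), key=lambda i: min_dists[i])
        selected.append(vectors[far_idx])
        for i, v in enumerate(vectors):
            min_dists[i] = min(min_dists[i], math.dist(v, vectors[far_idx]))
    return selected

# Toy example: the two mutually farthest corners are picked first
vs = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.5, 0.5)]
print(farthest_first_selection(vs, 2))  # [(0.0, 0.0), (1.0, 1.0)]
```

Because a selected vector's minimum distance drops to zero, it can never be picked again, so the traversal spreads the sample across the cluster.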

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 141 matches and 543 non-matches
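The SVM split step trains on the oracle-labelled sample (28 matches, 57 non-matches) and partitions the remaining unlabelled vectors into a predicted-match and a predicted-non-match cluster. A minimal sketch using scikit-learn's `SVC` with a linear kernel; the original script's SVM library and settings may differ:

```python
# Sketch of the SVM-based cluster split, assuming scikit-learn.
from sklearn.svm import SVC

def svm_split_cluster(labelled_vecs, labels, remaining_vecs):
    """Train an SVM on the oracle-labelled sample, then split the
    remaining weight vectors into predicted matches / non-matches."""
    clf = SVC(kernel="linear")
    clf.fit(labelled_vecs, labels)  # labels: 1 = match, 0 = non-match
    pred = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining_vecs, pred) if p == 0]
    return matches, non_matches
```

The two resulting clusters are pushed back onto the queue, which is why the queue length grows to 2 in the next loop.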

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 543 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 543 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 12 matches and 61 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.645
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)112_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 112), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)112_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 667
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 667 weight vectors
  Containing 217 true matches and 450 true non-matches
    (32.53% true matches)
  Identified 630 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   612  (97.14%)
          2 :    15  (2.38%)
          3 :     2  (0.32%)
         19 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 630 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 447

Removed 1 non-pure weight vector

Final number of weight vectors to use: 666
  Number of unique weight vectors: 630

Time to load and analyse the weight vector file: 0.04 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (630, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 630 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 630 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 547 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 135 matches and 412 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (412, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 412 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 412 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 12 matches and 58 non-matches
    Purity of oracle classification:  0.829
    Entropy of oracle classification: 0.661
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing the file: diverg(10)757_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 757), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)757_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 502
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 502 weight vectors
  Containing 189 true matches and 313 true non-matches
    (37.65% true matches)
  Identified 474 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   461  (97.26%)
          2 :    10  (2.11%)
          3 :     2  (0.42%)
         15 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 474 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 161
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 312

Removed 1 non-pure weight vector

Final number of weight vectors to use: 501
  Number of unique weight vectors: 474

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (474, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 474 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 474 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 28 matches and 52 non-matches
    Purity of oracle classification:  0.650
    Entropy of oracle classification: 0.934
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0
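The purity and entropy the oracle reports can be reproduced directly from the match / non-match counts; a minimal sketch, assuming purity is the majority-class fraction and entropy is the binary Shannon entropy of the class split (which matches the figures above); `purity_and_entropy` is a hypothetical helper, not a function from the script:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                    # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# 28 matches and 52 non-matches, as classified by the oracle above
purity, entropy = purity_and_entropy(28, 52)
print(round(purity, 3), round(entropy, 3))  # 0.65 0.934
```

The estimated match proportion reported later (0.35) is simply 28/80 from the same counts.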

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 394 weight vectors
  Based on 28 matches and 52 non-matches
  Classified 138 matches and 256 non-matches
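The SVM step trains on the oracle-labelled sample and partitions the remaining cluster vectors into two child clusters by predicted class. A dependency-free sketch of that split; note the script uses an SVM, so the nearest-centroid rule here is only a stand-in to illustrate the partitioning, and `split_by_classifier` is a hypothetical helper:

```python
def split_by_classifier(unlabelled, match_train, non_match_train):
    """Split remaining vectors into two child clusters using a rule
    learned from the oracle-labelled sample (nearest centroid here,
    an SVM in the actual script)."""
    def centroid(vs):
        n = len(vs)
        return [sum(col) / n for col in zip(*vs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    cm, cn = centroid(match_train), centroid(non_match_train)
    pred_match, pred_non_match = [], []
    for v in unlabelled:
        (pred_match if sq_dist(v, cm) < sq_dist(v, cn)
         else pred_non_match).append(v)
    return pred_match, pred_non_match  # the two child clusters

matches = [[0.9, 0.8], [1.0, 0.9]]
non_matches = [[0.1, 0.2], [0.2, 0.1]]
rest = [[0.95, 0.85], [0.15, 0.15], [0.8, 0.9]]
m, n = split_by_classifier(rest, matches, non_matches)
print(len(m), len(n))  # 2 1
```

Both predicted classes are then pushed onto the cluster queue, which is why the queue length grows to 2 in the next loop.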

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.65, 0.934068055375491, 0.35)
    (256, 0.65, 0.934068055375491, 0.35)

Current size of match and non-match training data sets: 28 / 52

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 138 weight vectors
- Estimated match proportion 0.350

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 138 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
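The "farthest first" step above is the classic farthest-point (max-min) traversal: starting from one vector, repeatedly pick the unselected vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and an arbitrary starting index (the script's seeding and metric may differ); `farthest_first` is a hypothetical helper:

```python
def farthest_first(vectors, k, start=0):
    """Greedy max-min (farthest-point) selection of k vector indices."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    selected = [start]
    # minimum distance from each vector to the selected set so far
    min_dist = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        # pick the vector farthest from everything selected so far
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        min_dist = [min(d, dist(v, vectors[nxt]))
                    for d, v in zip(min_dist, vectors)]
    return selected

vecs = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (1.0, 0.0)]
print(farthest_first(vecs, 3))  # [0, 1, 3]
```

This greedy rule spreads the sample across the cluster, which is why the selected vectors above cover both clearly matching and clearly non-matching regions of the weight space.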

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 49 matches and 5 non-matches
    Purity of oracle classification:  0.907
    Entropy of oracle classification: 0.445
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing the file: diverg(15)769_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 769), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)769_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 770
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 770 weight vectors
  Containing 207 true matches and 563 true non-matches
    (26.88% true matches)
  Identified 741 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   724  (97.71%)
          2 :    14  (1.89%)
          3 :     2  (0.27%)
         12 :     1  (0.13%)
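The occurrence distribution (and the resulting unique count) can be reproduced with a `Counter` over the weight vectors; a small sketch, assuming vectors are compared as exact tuples, with `occurrence_distribution` as a hypothetical helper:

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map occurrence count -> number of unique vectors with that count."""
    per_vector = Counter(map(tuple, vectors))   # vector -> how often it occurs
    return Counter(per_vector.values())         # count  -> number of vectors

# the distribution reported above: 724 singletons, 14 pairs,
# 2 triples, and one vector occurring 12 times
dist = Counter({1: 724, 2: 14, 3: 2, 12: 1})
total = sum(occ * num for occ, num in dist.items())
unique = sum(dist.values())
print(total, unique)  # 770 741
```

The totals check out against the log: 724 + 28 + 6 + 12 = 770 weight vectors, of which 724 + 14 + 2 + 1 = 741 are unique.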

Identified 1 non-pure unique weight vectors (from 741 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 560

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 769
  Number of unique weight vectors: 741
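The pureness filtering above removes the minority-class copies of any unique weight vector that carries both labels (here the vector with pureness 0.917, i.e. 11 matches out of 12 copies, loses its single non-match copy, leaving 769 vectors). A minimal sketch, assuming ties go to the match class (the script's tie-breaking is not shown); `remove_minority_copies` is a hypothetical helper:

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """Keep only the majority-class copies of each unique weight vector.

    labelled_vectors: list of (vector_tuple, is_match) pairs.
    """
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        majority = sum(labels) * 2 >= len(labels)  # matches dominate (or tie)
        kept.extend((vec, majority) for lbl in labels if lbl == majority)
    return kept

# a vector occurring 12 times, 11 of them matches (pureness 11/12 = 0.917):
data = [((0.9, 0.9), True)] * 11 + [((0.9, 0.9), False)]
print(len(remove_minority_copies(data)))  # 11
```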

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (741, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 741 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 741 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 35 matches and 50 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.977
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 656 weight vectors
  Based on 35 matches and 50 non-matches
  Classified 152 matches and 504 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.5882352941176471, 0.9774178175281716, 0.4117647058823529)
    (504, 0.5882352941176471, 0.9774178175281716, 0.4117647058823529)

Current size of match and non-match training data sets: 35 / 50

Selected cluster with (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 504 weight vectors
- Estimated match proportion 0.412

Sample size for this cluster: 78

Farthest first selection of 78 weight vectors from 504 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.500, 0.565, 0.857, 0.538, 0.786] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 2 matches and 76 non-matches
    Purity of oracle classification:  0.974
    Entropy of oracle classification: 0.172
    Number of true matches:      2
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing the file: diverg(10)162_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 162), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)162_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 620
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 620 weight vectors
  Containing 190 true matches and 430 true non-matches
    (30.65% true matches)
  Identified 580 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   546  (94.14%)
          2 :    31  (5.34%)
          3 :     2  (0.34%)
          6 :     1  (0.17%)

Identified 0 non-pure unique weight vectors (from 580 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 170
     0.000 : 410

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 620
  Number of unique weight vectors: 580

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (580, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 580 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 580 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 498 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 142 matches and 356 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (356, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 356 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 356 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.714, 0.727, 0.750, 0.294, 0.833] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.348, 0.429, 0.526, 0.529, 0.619] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
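
The "farthest first" selections logged above follow the classic greedy scheme: start from one vector and repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and a fixed starting vector (the program's actual seeding and distance metric are not visible in the log):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors.

    Starts from the first vector and repeatedly adds the vector whose
    minimum Euclidean distance to the selected set is largest.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]        # seeding choice is an assumption
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

This spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches and clear non-matches rather than clustering around one region.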

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 4 matches and 64 non-matches
    Purity of oracle classification:  0.941
    Entropy of oracle classification: 0.323
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0
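
The purity and entropy figures printed for each oracle classification match the standard two-class definitions: purity is the majority-class fraction of the sample, and entropy is the binary Shannon entropy of the match proportion. A sketch that reproduces the numbers above (the function name is illustrative):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity and binary entropy of a two-class sample of
    oracle-classified weight vectors."""
    n = num_matches + num_non_matches
    p = num_matches / n            # match proportion
    purity = max(p, 1.0 - p)       # majority-class fraction
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                # 0 * log(0) treated as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy
```

For the 4 matches and 64 non-matches above this gives purity 0.941 and entropy 0.323, as logged.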

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing file: diverg(10)848_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                 0.976
recall                 0.408027
f-measure              0.575472
da                          125
dm                            0
ndm                           0
tp                          122
fp                            3
tn                  4.76529e+07
fn                          177
Name: (10, 1 - acm diverg, 848), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)848_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 967
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 967 weight vectors
  Containing 143 true matches and 824 true non-matches
    (14.79% true matches)
  Identified 933 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   904  (96.89%)
          2 :    26  (2.79%)
          3 :     2  (0.21%)
          5 :     1  (0.11%)
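
The frequency distribution above can be computed in two counting passes: first count how often each unique weight vector occurs, then count how many vectors share each occurrence value. A sketch using `collections.Counter` (not necessarily how the program itself does it):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of unique weight
    vectors that occur that often."""
    counts = Counter(tuple(v) for v in weight_vectors)  # vector -> copies
    return Counter(counts.values())                     # copies -> vectors
```

A vector appearing once contributes to the "1" row, a duplicated pair to the "2" row, and so on, matching the table above.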

Identified 0 non-pure unique weight vectors (from 933 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 129
     0.000 : 804

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 967
  Number of unique weight vectors: 933

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (933, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 933 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 933 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 27 matches and 60 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.894
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
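
The oracle's accuracy parameter (100.00 in these runs, hence zero wrong classifications) can be modelled by flipping each true match status with probability 1 - accuracy. A hypothetical sketch; the program's exact noise model is not shown in the log:

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    """Simulate a manual oracle: each true match status is reported
    correctly with probability `accuracy`, otherwise flipped.
    (Assumed noise model; seed 42 is arbitrary.)"""
    rng = rng or random.Random(42)
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]
```

With accuracy 1.0 the labels pass through unchanged, which matches the "wrongly classify 0" lines in this log.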

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 846 weight vectors
  Based on 27 matches and 60 non-matches
  Classified 90 matches and 756 non-matches
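
The SVM split step trains a classifier on the oracle-labelled sample and uses it to partition the remaining unlabelled weight vectors into a candidate-match and a candidate-non-match sub-cluster, which then both go back on the queue. A minimal sketch assuming scikit-learn's `SVC` (the kernel and other settings of the original program are not visible in the log):

```python
from sklearn.svm import SVC

def svm_split(match_vectors, non_match_vectors, unlabelled):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining unlabelled vectors into two sub-clusters."""
    X = match_vectors + non_match_vectors
    y = [1] * len(match_vectors) + [0] * len(non_match_vectors)
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(X, y)
    pred = clf.predict(unlabelled)
    matches = [v for v, p in zip(unlabelled, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled, pred) if p == 0]
    return matches, non_matches
```

In the step above, 27 matches and 60 non-matches train the classifier, which splits the 846 remaining vectors into sub-clusters of size 90 and 756.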

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (90, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)
    (756, 0.6896551724137931, 0.8935711016541907, 0.3103448275862069)

Current size of match and non-match training data sets: 27 / 60

Selected cluster with (queue ordering: random):
- Purity 0.69 and entropy 0.89
- Size 756 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 756 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 5 matches and 69 non-matches
    Purity of oracle classification:  0.932
    Entropy of oracle classification: 0.357
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

125.0
Analysing file: diverg(10)822_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 822), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)822_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 708
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 708 weight vectors
  Containing 207 true matches and 501 true non-matches
    (29.24% true matches)
  Identified 674 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   657  (97.48%)
          2 :    14  (2.08%)
          3 :     2  (0.30%)
         17 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 674 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 498

Removed 1 non-pure weight vector
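
The pureness handling above (the 0.941 value arises from one vector occurring 17 times with 16 matches and 1 non-match; its single minority-class copy is removed) can be sketched as follows; the grouping key and the 0.5 majority threshold are assumptions:

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors):
    """Group identical weight vectors, compute their pureness (fraction
    of true matches among the copies), and drop minority-class copies
    of non-pure vectors. Input: (vector, is_match) pairs."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        majority = pureness >= 0.5   # ties resolved to match (assumed)
        for is_match in labels:
            if is_match == majority:
                kept.append((list(vec), is_match))
    return kept
```

Pure groups (pureness 1.000 or 0.000) pass through untouched; only copies carrying the minority label of a mixed group are dropped, so 708 vectors become 707 here.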

Final number of weight vectors to use: 707
  Number of unique weight vectors: 674

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (674, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 674 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 674 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 35 matches and 49 non-matches
    Purity of oracle classification:  0.583
    Entropy of oracle classification: 0.980
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 590 weight vectors
  Based on 35 matches and 49 non-matches
  Classified 278 matches and 312 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (278, 0.5833333333333334, 0.9798687566511527, 0.4166666666666667)
    (312, 0.5833333333333334, 0.9798687566511527, 0.4166666666666667)

Current size of match and non-match training data sets: 35 / 49

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 312 weight vectors
- Estimated match proportion 0.417

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 312 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.700, 0.645, 0.316, 0.455, 0.714] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.667, 0.857, 0.353, 0.632, 0.550] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [0.667, 0.000, 0.800, 0.684, 0.667, 0.529, 0.609] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.950, 0.000, 0.619, 0.800, 0.478, 0.280, 0.625] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.611, 0.000, 0.800, 0.684, 0.500, 0.778, 0.609] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.533, 0.000, 0.577, 0.783, 0.429, 0.615, 0.478] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.600, 0.700, 0.600, 0.611, 0.706] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 0 matches and 72 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)701_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 701), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)701_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 752
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 752 weight vectors
  Containing 222 true matches and 530 true non-matches
    (29.52% true matches)
  Identified 716 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   697  (97.35%)
          2 :    16  (2.23%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)
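A frequency table like the one above can be computed with `collections.Counter`; this is a sketch of the counting step, not the original script's code:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count occurrences of each distinct weight vector, then tally how
    many distinct vectors occur once, twice, three times, and so on."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())

vecs = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3), (0.9, 0.1)]
print(occurrence_distribution(vecs))   # Counter({1: 2, 2: 1})
```

The inner `Counter` maps each unique vector to its multiplicity; the outer one is exactly the "Occurrence : Number of weight vectors" table.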

Identified 1 non-pure unique weight vector (from 716 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 527

Removed 1 non-pure weight vector

Final number of weight vectors to use: 751
  Number of unique weight vectors: 716
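Pureness here is the fraction of a unique weight vector's occurrences that carry the true-match label; for a non-pure vector, the minority-class copies are removed. A hedged sketch of the pureness computation (helper and variable names are ours):

```python
from collections import defaultdict

def pureness(labelled_vectors):
    """Map each unique weight vector to the fraction of its occurrences
    labelled as true matches (1.0 or 0.0 means the vector is pure)."""
    labels = defaultdict(list)
    for vec, is_match in labelled_vectors:
        labels[tuple(vec)].append(is_match)
    return {vec: sum(labs) / len(labs) for vec, labs in labels.items()}

# A vector seen 17 times: 16 true-match copies and 1 non-match copy.
data = ([((0.9, 0.8), True)] * 16 + [((0.9, 0.8), False)]
        + [((0.1, 0.2), False)])
table = pureness(data)
print(table[(0.9, 0.8)])   # 16/17 ~ 0.941: non-pure, the minority copy is dropped
```

This matches the log, where the single vector occurring 17 times has pureness 0.941 and loses its one minority-class copy.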

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (716, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 716 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 716 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
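Farthest-first selection greedily picks, at each step, the weight vector whose minimum distance to the already-selected set is largest. A dependency-free sketch of the traversal (the random start vector and the Euclidean metric are assumptions; the original script's details may differ):

```python
import random

def farthest_first(vectors, k, seed=42):
    """Greedy farthest-first traversal: seed with a random vector, then
    repeatedly add the vector farthest from its nearest selected vector."""
    def d2(a, b):   # squared Euclidean distance (monotone in distance)
        return sum((x - y) ** 2 for x, y in zip(a, b))
    remaining = list(vectors)
    rnd = random.Random(seed)
    selected = [remaining.pop(rnd.randrange(len(remaining)))]
    while remaining and len(selected) < k:
        i = max(range(len(remaining)),
                key=lambda i: min(d2(remaining[i], s) for s in selected))
        selected.append(remaining.pop(i))
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.5, 0.4)], 2)
```

Because each pick maximises the minimum distance to the current sample, the selected vectors spread across the weight space, which is why the sample above mixes clear matches, clear non-matches, and borderline pairs.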

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 632 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 146 matches and 486 non-matches
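At this step the script trains an SVM on the oracle-labelled vectors and splits the rest of the cluster into predicted matches and non-matches. As a dependency-free stand-in, this sketch uses a perceptron: like a linear SVM it learns a separating hyperplane, but without margin maximisation (the original script's SVM library and settings are not shown here):

```python
def train_perceptron(matches, non_matches, epochs=100, lr=0.1):
    """Learn a linear separator w.x + b from labelled weight vectors."""
    data = [(x, 1) for x in matches] + [(x, -1) for x in non_matches]
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                # Misclassified: nudge the hyperplane towards x
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def split_cluster(w, b, unlabelled):
    """Classify remaining vectors into predicted matches / non-matches."""
    pred_m, pred_n = [], []
    for x in unlabelled:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        (pred_m if score > 0 else pred_n).append(x)
    return pred_m, pred_n
```

The two predicted sub-clusters are then pushed back onto the queue, which is why the next loop shows a queue of length 2.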

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (486, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 486 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 486 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 8 matches and 67 non-matches
    Purity of oracle classification:  0.893
    Entropy of oracle classification: 0.490
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(20)70_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 70), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)70_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)238_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 238), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)238_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 645
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 645 weight vectors
  Containing 200 true matches and 445 true non-matches
    (31.01% true matches)
  Identified 596 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   562  (94.30%)
          2 :    31  (5.20%)
          3 :     2  (0.34%)
         15 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 596 unique weight vectors)
Pureness (as the fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 171
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 424

Removed 1 non-pure weight vector

Final number of weight vectors to use: 644
  Number of unique weight vectors: 596

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (596, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 596 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 596 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 28 matches and 54 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
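The purity and entropy figures reported here follow the standard definitions: purity is the majority-class fraction of the labelled sample, and entropy is the binary Shannon entropy of the match proportion (which also serves as the cluster's estimated match proportion). A minimal sketch; the function name is mine:

```python
import math

def cluster_quality(num_matches, num_non_matches):
    """Return (purity, entropy) for a labelled sample of weight vectors.

    Purity is the majority-class fraction; entropy is the binary
    Shannon entropy of the match proportion.
    """
    total = num_matches + num_non_matches
    p = num_matches / total            # estimated match proportion
    purity = max(p, 1.0 - p)           # fraction of the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                    # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# The 28 matches / 54 non-matches classified by the oracle above:
purity, entropy = cluster_quality(28, 54)
```

For 28 matches and 54 non-matches this reproduces the 0.659 purity and 0.926 entropy printed above.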

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 514 weight vectors
  Based on 28 matches and 54 non-matches
  Classified 169 matches and 345 non-matches
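After an oracle round, the remaining unlabelled weight vectors in the cluster are split in two by a classifier trained on the oracle-labelled sample, as in the SVM step above. A sketch of that train-then-split step using scikit-learn's `SVC`; the linear kernel and the function name are my assumptions, not necessarily the program's settings:

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining cluster into predicted matches and non-matches."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the next loop reports "Queue length: 2".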

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (169, 0.6585365853658537, 0.9262122127346665, 0.34146341463414637)
    (345, 0.6585365853658537, 0.9262122127346665, 0.34146341463414637)

Current size of match and non-match training data sets: 28 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 169 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 169 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.947, 1.000, 0.292, 0.178, 0.227, 0.122, 0.154] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
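The "farthest first" step greedily builds a diverse sample: each new vector is the one whose minimum distance to the already-selected vectors is largest. A dependency-free sketch; the Euclidean metric and seeding from the first vector are my assumptions:

```python
def farthest_first(vectors, k):
    """Greedily select k indices from a list of equal-length numeric
    vectors so that each new pick maximises its minimum distance to
    the picks made so far."""
    def dist2(a, b):  # squared Euclidean distance (monotone in distance)
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [0]  # seed choice is an assumption
    # min squared distance from every vector to the selected set
    min_d = [dist2(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_d[i] = min(min_d[i], dist2(v, vectors[nxt]))
    return selected
```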

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 44 matches and 13 non-matches
    Purity of oracle classification:  0.772
    Entropy of oracle classification: 0.775
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  13
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
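The run above illustrates the overall driver loop: pop a cluster from the queue, have the oracle label a diverse sample, add those labels to the training sets, and, if the cluster is still impure or too large, split the remainder with a classifier and re-queue the parts, stopping once the manual classification budget is spent. A dependency-free sketch; all names, thresholds, and the purity estimate are my assumptions:

```python
import random

def recursive_selection(clusters, budget, sample_fn, oracle_fn, split_fn,
                        min_purity=0.95, max_cluster_size=100):
    """Budget-limited recursive training example selection (sketch)."""
    queue = list(clusters)
    used = 0                      # manual oracle classifications so far
    train_m, train_n = [], []     # match / non-match training sets
    while queue and used < budget:
        cluster = queue.pop(random.randrange(len(queue)))  # random queue ordering
        sample = sample_fn(cluster)
        if used + len(sample) > budget:
            break                  # "Reached end of manual classification budget"
        labels = oracle_fn(sample)
        used += len(sample)
        train_m += [v for v, m in zip(sample, labels) if m]
        train_n += [v for v, m in zip(sample, labels) if not m]
        rest = [v for v in cluster if v not in sample]  # delete labelled vectors
        purity = max(len(train_m), len(train_n)) / max(used, 1)
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            queue.extend(split_fn(rest, train_m, train_n))  # e.g. the SVM split
    return train_m, train_n, used
```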

55.0
Analysing file: diverg(20)821_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 821), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)821_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 732
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 732 weight vectors
  Containing 219 true matches and 513 true non-matches
    (29.92% true matches)
  Identified 677 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   641  (94.68%)
          2 :    33  (4.87%)
          3 :     2  (0.30%)
         19 :     1  (0.15%)
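The occurrence table above is a count of counts: first count how often each distinct weight vector appears, then count how many vectors share each frequency. A sketch with `collections.Counter`; the function name is mine:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors that appear exactly that many times."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return dict(sorted(Counter(vec_counts.values()).items()))

vecs = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3], [0.9, 0.9], [1.0, 0.5]]
dist = occurrence_distribution(vecs)  # one vector occurs 3 times, two occur once
```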

Identified 1 non-pure unique weight vector (from 677 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 492

Removed 1 non-pure weight vector
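A unique weight vector is non-pure when identical vectors were generated by both matching and non-matching record pairs; the minority-class copies are removed. The pureness of 0.947 above corresponds to the vector occurring 19 times in the frequency table: 18 match copies and 1 non-match copy, so the single non-match copy is dropped. A sketch of that clean-up; the tie-breaking rule is my assumption:

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """labelled_vectors: list of (weight_vector_tuple, is_match) pairs.
    For every unique vector with mixed labels, drop the minority-class
    copies and keep the majority-class ones."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        matches = sum(labels)
        majority = matches * 2 >= len(labels)  # tie kept as match (assumption)
        kept += [(vec, lab) for lab in labels if lab == majority]
    return kept
```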

Final number of weight vectors to use: 731
  Number of unique weight vectors: 677

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (677, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 677 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 677 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 593 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 148 matches and 445 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (445, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 445 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 445 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 8 matches and 62 non-matches
    Purity of oracle classification:  0.886
    Entropy of oracle classification: 0.513
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(10)460_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (10, 1 - acm diverg, 460), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)460_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 472
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 472 weight vectors
  Containing 223 true matches and 249 true non-matches
    (47.25% true matches)
  Identified 436 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   420  (96.33%)
          2 :    13  (2.98%)
          3 :     2  (0.46%)
         20 :     1  (0.23%)

Identified 1 non-pure unique weight vector (from 436 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 187
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 248

Removed 1 non-pure weight vector

Final number of weight vectors to use: 471
  Number of unique weight vectors: 436

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (436, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 436 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 436 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 34 matches and 45 non-matches
    Purity of oracle classification:  0.570
    Entropy of oracle classification: 0.986
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 357 weight vectors
  Based on 34 matches and 45 non-matches
  Classified 148 matches and 209 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.569620253164557, 0.9859690274511927, 0.43037974683544306)
    (209, 0.569620253164557, 0.9859690274511927, 0.43037974683544306)

Current size of match and non-match training data sets: 34 / 45

Selected cluster (queue ordering: random) with:
- Purity 0.57 and entropy 0.99
- Size 148 weight vectors
- Estimated match proportion 0.430

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 51 matches and 7 non-matches
    Purity of oracle classification:  0.879
    Entropy of oracle classification: 0.531
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0
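
The purity and entropy figures reported by the oracle steps follow the standard definitions over the two class counts: purity is the majority-class fraction and entropy is the binary Shannon entropy of the class proportions. A minimal sketch (the function name `oracle_stats` is illustrative, not from the source):

```python
import math

def oracle_stats(num_matches, num_non_matches):
    # Purity: fraction of the sample belonging to the majority class.
    # Entropy: binary Shannon entropy of the match/non-match proportions.
    total = num_matches + num_non_matches
    p = num_matches / total
    q = num_non_matches / total
    purity = max(p, q)
    entropy = -sum(x * math.log(x, 2) for x in (p, q) if x > 0.0)
    return purity, entropy

# The 58-vector sample above (51 matches, 7 non-matches):
purity, entropy = oracle_stats(51, 7)
print(round(purity, 3), round(entropy, 3))  # 0.879 0.531
```

The same formulas reproduce the later oracle reports as well (e.g. 30 matches / 49 non-matches gives purity 0.620, entropy 0.958).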

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)708_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984615
recall                 0.214047
f-measure              0.351648
da                           65
dm                            0
ndm                           0
tp                           64
fp                            1
tn                  4.76529e+07
fn                          235
Name: (10, 1 - acm diverg, 708), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)708_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 487
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 487 weight vectors
  Containing 182 true matches and 305 true non-matches
    (37.37% true matches)
  Identified 462 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   451  (97.62%)
          2 :     8  (1.73%)
          3 :     2  (0.43%)
         14 :     1  (0.22%)

Identified 1 non-pure unique weight vector (from 462 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 157
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 304

Removed 1 non-pure weight vector

Final number of weight vectors to use: 486
  Number of unique weight vectors: 462
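
The pureness analysis above groups identical weight vectors, computes the fraction of matches per group, and removes the minority-class copies of any non-pure group (consistent with the vector occurring 14 times having 13 match and 1 non-match copies: 13/14 ≈ 0.929). A minimal sketch, assuming `(vector, is_match)` input pairs; the helper name is illustrative:

```python
from collections import defaultdict

def remove_minority_class(weight_vectors):
    # Group identical weight vectors, compute the fraction of matches
    # ("pureness") per group, and keep only the majority-class copies.
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        majority = pureness >= 0.5  # ties resolved towards matches here
        kept += [(vec, m) for m in labels if m == majority]
    return kept

# 13 match copies + 1 non-match copy of one vector, plus a pure non-match:
data = ([((1.0, 0.9), True)] * 13 + [((1.0, 0.9), False)]
        + [((0.1, 0.0), False)])
print(len(remove_minority_class(data)))  # 14 (one minority copy removed)
```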

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (462, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 462 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 462 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
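
The "far" selections logged in this section can be sketched as a standard farthest-first traversal: repeatedly pick the vector whose minimum Euclidean distance to the already-selected set is largest. A sketch under assumed details only; the original program's distance metric and seeding rule may differ:

```python
def farthest_first(vectors, k):
    # Greedy farthest-first traversal over a list of numeric tuples.
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    selected = [vectors[0]]          # seed with the first vector
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Candidate whose nearest selected neighbour is farthest away.
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

print(farthest_first([(0.0, 0.0), (1.0, 0.0), (0.1, 0.0), (10.0, 0.0)], 3))
# [(0.0, 0.0), (10.0, 0.0), (1.0, 0.0)]
```

This greedy rule is why the selected samples above mix extreme matches and non-matches: each pick maximises coverage of the cluster's weight-vector space.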

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 30 matches and 49 non-matches
    Purity of oracle classification:  0.620
    Entropy of oracle classification: 0.958
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 383 weight vectors
  Based on 30 matches and 49 non-matches
  Classified 129 matches and 254 non-matches
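
The SVM split trains on the oracle-labelled sample and partitions the remaining cluster by predicted class. A minimal sketch using scikit-learn's `SVC` (an assumption — the log does not name the SVM implementation or kernel used):

```python
from sklearn import svm

def svm_split(train_vectors, train_labels, cluster_vectors):
    # Fit an SVM on the oracle-classified sample, then split the
    # remaining cluster into predicted matches / non-matches.
    clf = svm.SVC(kernel="linear")
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, preds) if p != 1]
    return matches, non_matches

# Toy example: one-dimensional, linearly separable labels.
m, n = svm_split([[0.0], [0.1], [0.9], [1.0]], [0, 0, 1, 1],
                 [[0.05], [0.95]])
print(len(m), len(n))  # 1 1
```

The two resulting partitions are what get pushed onto the cluster queue for the next loop, each inheriting the parent sample's purity and entropy estimates.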

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (129, 0.620253164556962, 0.9578630237479795, 0.379746835443038)
    (254, 0.620253164556962, 0.9578630237479795, 0.379746835443038)

Current size of match and non-match training data sets: 30 / 49

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 129 weight vectors
- Estimated match proportion 0.380

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 129 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.867, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 48 matches and 5 non-matches
    Purity of oracle classification:  0.906
    Entropy of oracle classification: 0.451
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

65.0
Analysing the file: diverg(10)225_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987342
recall                  0.26087
f-measure              0.412698
da                           79
dm                            0
ndm                           0
tp                           78
fp                            1
tn                  4.76529e+07
fn                          221
Name: (10, 1 - acm diverg, 225), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)225_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 802
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 802 weight vectors
  Containing 188 true matches and 614 true non-matches
    (23.44% true matches)
  Identified 760 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   729  (95.92%)
          2 :    28  (3.68%)
          3 :     2  (0.26%)
         11 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 760 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 166
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 593

Removed 1 non-pure weight vector

Final number of weight vectors to use: 801
  Number of unique weight vectors: 760

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (760, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 760 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 675 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 136 matches and 539 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (539, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 136 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 45 matches and 8 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

79.0
Analysing the file: diverg(15)616_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 616), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)616_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 831
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 831 weight vectors
  Containing 227 true matches and 604 true non-matches
    (27.32% true matches)
  Identified 774 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   737  (95.22%)
          2 :    34  (4.39%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 774 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 583

Removed 1 non-pure weight vector

Final number of weight vectors to use: 830
  Number of unique weight vectors: 774

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (774, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 774 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 774 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

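The listing above is produced by a farthest-first traversal over the cluster's weight vectors. A minimal sketch of that selection strategy, assuming Euclidean distance and seeding from the first vector (the function name `farthest_first` is an assumption, not the program's actual code):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors: start from the first vector, then
    repeatedly add the vector whose distance to its nearest already
    selected vector is largest, so the picks spread out."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    while len(selected) < min(k, len(vectors)):
        # Candidate score: distance to the closest selected vector
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

This spreading behaviour is why the sample mixes very high and very low similarity vectors rather than clustering around one region.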
Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

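The purity and entropy figures reported after each oracle step follow directly from the match / non-match counts: purity is the majority-class fraction, entropy the base-2 Shannon entropy of the split. A sketch with hypothetical helper names:

```python
import math

def purity(num_match, num_non_match):
    """Fraction of the majority class among the classified vectors."""
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def entropy(num_match, num_non_match):
    """Base-2 Shannon entropy of the match / non-match split."""
    total = num_match + num_non_match
    h = 0.0
    for n in (num_match, num_non_match):
        p = n / total
        if p > 0.0:
            h -= p * math.log2(p)
    return h
```

For the 28 / 57 split above this gives purity 0.671 and entropy 0.914, matching the values in the log.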
Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 689 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 151 matches and 538 non-matches

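After each oracle step, the remaining vectors are split by a classifier trained on the oracle-labelled examples, yielding the two sub-clusters (here 151 and 538) that enter the queue. The log uses an SVM; the sketch below substitutes a nearest-centroid classifier so it stays dependency-free (`split_cluster` is a hypothetical name, not the program's API):

```python
import math

def split_cluster(train_vecs, train_labels, rest_vecs):
    """Split the unclassified vectors into predicted-match and
    predicted-non-match sub-clusters, using a nearest-centroid
    stand-in for the SVM in the log."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    match_cen = centroid([v for v, m in zip(train_vecs, train_labels) if m])
    non_cen = centroid([v for v, m in zip(train_vecs, train_labels) if not m])
    matches, non_matches = [], []
    for v in rest_vecs:
        closer_to_match = math.dist(v, match_cen) < math.dist(v, non_cen)
        (matches if closer_to_match else non_matches).append(v)
    return matches, non_matches
```

Either way, the split produces the two new queue entries whose purity and match proportion are estimated from the oracle sample.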
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (538, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 151 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 51 matches and 3 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.310
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

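A 100.00% accurate oracle simply returns the true match status; the accuracy parameter lets the program simulate a noisy human labeller that flips each label with probability 1 - accuracy. A minimal sketch under that assumption (hypothetical function, seeded RNG for reproducibility):

```python
import random

def oracle_classify(true_labels, accuracy, seed=42):
    """Return each true label unchanged with probability `accuracy`,
    otherwise flip it (simulating an imperfect human oracle)."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

With accuracy 1.0 the counts of false matches and false non-matches are necessarily zero, as in every oracle block of this run.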
Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)117_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 117), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)117_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 515
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 515 weight vectors
  Containing 190 true matches and 325 true non-matches
    (36.89% true matches)
  Identified 487 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   474  (97.33%)
          2 :    10  (2.05%)
          3 :     2  (0.41%)
         15 :     1  (0.21%)

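The frequency distribution above ("occurrence : how many vectors occur that often") can be reproduced by counting twice: first duplicates per vector, then vectors per duplicate count. A sketch with `collections.Counter` (`occurrence_distribution` is a hypothetical helper):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each weight vector occurs, then tally how many
    distinct vectors share each occurrence count."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```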
Identified 1 non-pure unique weight vector (from 487 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 162
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 324

Removed 1 non-pure weight vector

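A unique weight vector whose duplicate copies carry mixed true-match labels is "non-pure", and its minority-class copies are removed, as in the step above. A sketch of such a filter (hypothetical helper, not the original code; pureness is the match fraction per unique vector):

```python
from collections import defaultdict

def remove_minority_copies(pairs):
    """pairs: iterable of (weight_vector_tuple, is_match).
    For each unique vector compute its pureness (fraction of match
    labels) and keep only the majority-class copies; pure vectors
    (pureness 0.0 or 1.0) pass through unchanged."""
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        majority = pureness >= 0.5  # ties resolved towards matches
        kept.extend((vec, m) for m in labels if m == majority)
    return kept
```

This guarantees every surviving unique weight vector has a single, unambiguous label before clustering begins.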
Final number of weight vectors to use: 514
  Number of unique weight vectors: 487

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (487, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 487 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 487 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 31 matches and 49 non-matches
    Purity of oracle classification:  0.613
    Entropy of oracle classification: 0.963
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  49
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 407 weight vectors
  Based on 31 matches and 49 non-matches
  Classified 141 matches and 266 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6125, 0.9631672450918832, 0.3875)
    (266, 0.6125, 0.9631672450918832, 0.3875)

Current size of match and non-match training data sets: 31 / 49

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.96
- Size 266 weight vectors
- Estimated match proportion 0.388

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 266 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.875, 0.484, 0.474, 0.417, 0.524] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.864, 0.667, 0.435, 0.700, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.565, 0.737, 0.588, 0.727, 0.762] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.846, 0.857, 0.353, 0.318, 0.400] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.680, 0.000, 0.609, 0.737, 0.600, 0.529, 0.696] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.615, 0.714, 0.353, 0.583, 0.571] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 0 matches and 68 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing the file: diverg(20)241_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 241), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)241_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 26 matches and 62 non-matches
    Purity of oracle classification:  0.705
    Entropy of oracle classification: 0.876
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 26 matches and 62 non-matches
  Classified 119 matches and 829 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (119, 0.7045454545454546, 0.8756633923230397, 0.29545454545454547)
    (829, 0.7045454545454546, 0.8756633923230397, 0.29545454545454547)

Current size of match and non-match training data sets: 26 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 119 weight vectors
- Estimated match proportion 0.295

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 119 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
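
The farthest-first ("far") selection shown above can be sketched as a greedy traversal: seed the sample with one vector, then repeatedly add the vector whose minimum distance to the already-selected set is largest. A minimal version, assuming Euclidean distance over the weight vectors (the function and parameter names are illustrative, not taken from the original script):

```python
import math
import random

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal: pick a random start vector, then
    repeatedly select the vector whose distance to its nearest already
    selected vector is largest."""
    rng = random.Random(seed)
    remaining = list(vectors)
    selected = [remaining.pop(rng.randrange(len(remaining)))]
    while remaining and len(selected) < k:
        # Distance from a candidate to its closest selected vector
        def min_dist(v):
            return min(math.dist(v, s) for s in selected)
        farthest = max(remaining, key=min_dist)
        remaining.remove(farthest)
        selected.append(farthest)
    return selected
```

Selecting mutually distant vectors spreads the oracle budget across the whole cluster instead of repeatedly sampling one dense region of the weight-vector space.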

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-matches
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0
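
The purity and entropy figures printed for each oracle step follow the standard majority-class and binary-entropy definitions over the match / non-match counts. A sketch (assuming binary 1/0 match labels) that reproduces the 0.979 / 0.146 values above:

```python
import math

def cluster_stats(labels):
    """Purity is the proportion of the majority class; entropy is the
    base-2 entropy of the match / non-match proportions."""
    n = len(labels)
    p_match = sum(labels) / n
    purity = max(p_match, 1.0 - p_match)
    entropy = -sum(q * math.log2(q)
                   for q in (p_match, 1.0 - p_match) if q > 0.0)
    return purity, entropy

# 47 matches and 1 non-match, as classified by the oracle above
purity, entropy = cluster_stats([1] * 47 + [0])
print(round(purity, 3), round(entropy, 3))  # 0.979 0.146
```

A perfectly balanced cluster gives purity 0.5 and entropy 1.0, which matches the initial queue entries printed at the start of each run.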

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)780_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 780), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)780_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 753
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 753 weight vectors
  Containing 198 true matches and 555 true non-matches
    (26.29% true matches)
  Identified 711 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   676  (95.08%)
          2 :    32  (4.50%)
          3 :     2  (0.28%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 711 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 535

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 753
  Number of unique weight vectors: 711

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (711, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 711 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 711 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 26 matches and 58 non-matches
    Purity of oracle classification:  0.690
    Entropy of oracle classification: 0.893
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 627 weight vectors
  Based on 26 matches and 58 non-matches
  Classified 126 matches and 501 non-matches
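
The SVM split above (training on the 26 oracle-labelled matches and 58 non-matches, then classifying the rest of the cluster) can be sketched with scikit-learn; the kernel and other settings here are assumptions, since this log does not show the original script's classifier configuration:

```python
from sklearn.svm import SVC

def split_cluster(train_vectors, train_labels, cluster_vectors):
    """Train an SVM on the oracle-labelled vectors, then split the
    remaining cluster into predicted-match and predicted-non-match
    sub-clusters (these go back onto the processing queue)."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vectors, train_labels)
    predictions = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, predictions) if p == 0]
    return matches, non_matches
```

Both sub-clusters initially inherit the purity, entropy, and match-proportion estimates of the parent's oracle sample, which is why the two queue entries printed in Loop 2 show identical statistics.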

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (126, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)
    (501, 0.6904761904761905, 0.8926230133850986, 0.30952380952380953)

Current size of match and non-match training data sets: 26 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.69 and entropy 0.89
- Size 126 weight vectors
- Estimated match proportion 0.310

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 126 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 48 matches and 2 non-matches
    Purity of oracle classification:  0.960
    Entropy of oracle classification: 0.242
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  2
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(20)247_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 247), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)247_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 806
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 806 weight vectors
  Containing 226 true matches and 580 true non-matches
    (28.04% true matches)
  Identified 767 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   748  (97.52%)
          2 :    16  (2.09%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 767 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 577

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 805
  Number of unique weight vectors: 767

Time to load and analyse the weight vector file: 0.05 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (767, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 767 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 767 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 682 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 141 matches and 541 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (541, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 541 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 541 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.786, 0.591, 0.273, 0.522, 0.450] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 12 matches and 61 non-matches
    Purity of oracle classification:  0.836
    Entropy of oracle classification: 0.645
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)252_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985915
recall                 0.234114
f-measure              0.378378
da                           71
dm                            0
ndm                           0
tp                           70
fp                            1
tn                  4.76529e+07
fn                          229
Name: (10, 1 - acm diverg, 252), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)252_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 872
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 872 weight vectors
  Containing 186 true matches and 686 true non-matches
    (21.33% true matches)
  Identified 832 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   798  (95.91%)
          2 :    31  (3.73%)
          3 :     2  (0.24%)
          6 :     1  (0.12%)

Identified 0 non-pure unique weight vectors (from 832 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 166
     0.000 : 666

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 872
  Number of unique weight vectors: 832

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (832, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 832 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 832 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

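The "far" initial selection above is a farthest-first traversal: starting from a seed vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest. A sketch under the assumptions of Euclidean distance and an arbitrary seed (the original implementation may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors (assumes k <= len(vectors))."""
    selected = [vectors[0]]  # seed with an arbitrary vector
    # Minimum distance from each vector to the selected set so far
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        # Pick the vector farthest from everything selected so far
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected
```

This explains why the selected vectors above are spread across the corners of the weight space rather than drawn from one region.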
Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

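The oracle step can be simulated as a labeller that answers correctly with a given probability; at 100% accuracy it simply returns the true labels, as in the run above. A hedged sketch (all names are illustrative, not the original code):

```python
import random

def oracle_classify(weight_vectors, true_labels, accuracy=1.0, rng=None):
    """Simulate a human oracle: each true label is returned with
    probability `accuracy`, otherwise flipped."""
    rng = rng or random.Random(0)
    matches, non_matches = [], []
    for wv, label in zip(weight_vectors, true_labels):
        answer = label if rng.random() < accuracy else not label
        (matches if answer else non_matches).append(wv)
    return matches, non_matches
```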
Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 746 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 158 matches and 588 non-matches

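The split step trains a classifier on the oracle-labelled vectors and partitions the remaining cluster by its predictions. A sketch using scikit-learn's `SVC` — an assumption, since the original script may use a different SVM binding or kernel:

```python
from sklearn.svm import SVC

def svm_split(train_matches, train_non_matches, remaining):
    """Fit an SVM on the oracle-classified vectors, then split the
    rest of the cluster into predicted match / non-match sub-clusters."""
    clf = SVC(kernel="linear")
    X = train_matches + train_non_matches
    y = [1] * len(train_matches) + [0] * len(train_non_matches)
    clf.fit(X, y)
    pred = clf.predict(remaining)
    match_cluster = [v for v, p in zip(remaining, pred) if p == 1]
    non_match_cluster = [v for v, p in zip(remaining, pred) if p == 0]
    return match_cluster, non_match_cluster
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the Loop 2 queue below holds clusters of 158 and 588 vectors (158 + 588 = 746, the number classified here).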
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (158, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (588, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 158 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 158 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.929, 1.000, 0.182, 0.238, 0.188, 0.146, 0.270] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 42 matches and 13 non-matches
    Purity of oracle classification:  0.764
    Entropy of oracle classification: 0.789
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  13
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

71.0
Analysing file: diverg(10)596_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979592
recall                  0.32107
f-measure              0.483627
da                           98
dm                            0
ndm                           0
tp                           96
fp                            2
tn                  4.76529e+07
fn                          203
Name: (10, 1 - acm diverg, 596), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)596_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 724
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 724 weight vectors
  Containing 168 true matches and 556 true non-matches
    (23.20% true matches)
  Identified 687 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   656  (95.49%)
          2 :    28  (4.08%)
          3 :     2  (0.29%)
          6 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 687 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 151
     0.000 : 536

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 724
  Number of unique weight vectors: 687

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (687, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 687 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 687 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 603 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 264 matches and 339 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (264, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (339, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 339 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 339 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.367, 0.667, 0.583, 0.625, 0.316] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.438, 0.500, 0.467, 0.529, 0.611] (False)
    [1.000, 0.000, 0.667, 0.500, 0.524, 0.786, 0.524] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [1.000, 0.000, 0.818, 0.727, 0.438, 0.375, 0.400] (False)
    [0.857, 0.000, 0.500, 0.389, 0.235, 0.045, 0.526] (False)
    [1.000, 0.000, 0.476, 0.179, 0.500, 0.412, 0.357] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 0.000, 0.833, 0.571, 0.727, 0.647, 0.857] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.583, 0.875, 0.727, 0.833, 0.643] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 0 matches and 71 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

98.0
Analysing file: diverg(20)159_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (20, 1 - acm diverg, 159), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)159_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1087
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1087 weight vectors
  Containing 214 true matches and 873 true non-matches
    (19.69% true matches)
  Identified 1033 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   998  (96.61%)
          2 :    32  (3.10%)
          3 :     2  (0.19%)
         19 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1033 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vector with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1086
  Number of unique weight vectors: 1033

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1033, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1033 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1033 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
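
The purity and entropy values printed by the oracle step match the usual two-class definitions: purity is the majority-class fraction of the labelled sample, and entropy is the binary Shannon entropy of the match proportion. A sketch reconstructing them (the function name is illustrative, not from the original program):

```python
import math

def purity_entropy(matches, non_matches):
    """Majority-class purity and binary Shannon entropy of an
    oracle-labelled sample, reconstructing the statistics in the log."""
    total = matches + non_matches
    p = matches / total                  # match proportion
    purity = max(p, 1.0 - p)            # majority-class fraction
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = purity_entropy(23, 65)   # the 88-vector sample above
# purity ≈ 0.739, entropy ≈ 0.829, as printed by the program
```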

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 945 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 98 matches and 847 non-matches
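
After the oracle labels a sample, the remaining cluster is split by a classifier trained on those labels (an SVM here, chosen via the split_classifier argument). As a dependency-free stand-in that shows the same fit-then-split flow, here is a nearest-centroid version; it is deliberately not the program's SVM:

```python
def centroid_split(train_matches, train_non_matches, unlabelled):
    """Split unlabelled weight vectors into predicted matches and
    non-matches. Nearest-centroid is a simple stand-in for the SVM
    named in the log; the control flow (fit on oracle-labelled vectors,
    split the rest of the cluster) is the same."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    m_cent, n_cent = centroid(train_matches), centroid(train_non_matches)
    pred_matches, pred_non_matches = [], []
    for v in unlabelled:
        (pred_matches if sqdist(v, m_cent) < sqdist(v, n_cent)
         else pred_non_matches).append(v)
    return pred_matches, pred_non_matches
```

Both predicted subsets then re-enter the cluster queue, which is why the queue length grows by one per split in the loop headers below.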

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (98, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68
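
The per-cluster sample sizes in this log (86 from 898 vectors at match proportion 0.5, 68 from 847 at 0.261, 87 from 981, and so on) are consistent with Cochran's finite-population sample-size formula at a 95% confidence level with a 10% sampling error, plausibly the sample_error argument; both constants and the final rounding are assumptions here:

```python
def cluster_sample_size(cluster_size, match_prop, error=0.10, z=1.96):
    """Cochran's sample size with finite-population correction. The 95%
    confidence z-score and 10% error are assumptions that reproduce
    several of the sizes printed in this log; the original program's
    rounding behaviour may differ."""
    n0 = (z * z) * match_prop * (1.0 - match_prop) / (error * error)
    return int(n0 / (1.0 + (n0 - 1.0) / cluster_size))

cluster_sample_size(898, 0.5)       # 86, as in Loop 1 of this run
cluster_sample_size(847, 23 / 88)   # 68, as for this cluster
```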

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0
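
The oracle_acc parameter can be read as the probability that the simulated human oracle returns the true match status; at the 100.00% accuracy used in these runs no label is ever flipped, which is why the false match/non-match counts are always zero. A sketch, assuming independent per-label errors (the log does not confirm how errors are drawn):

```python
import random

def query_oracle(true_labels, accuracy=1.0, rng=None):
    """Simulated oracle: return each true match label, flipped with
    probability (1 - accuracy). Independent per-label flips are an
    assumption about how oracle_acc is applied."""
    rng = rng or random.Random()
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```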

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)390_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 390), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)390_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 218 true matches and 735 true non-matches
    (22.88% true matches)
  Identified 898 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   862  (95.99%)
          2 :    33  (3.67%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 898 unique weight vectors)
Pureness (proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 183
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 714

Removed 1 non-pure weight vector

Final number of weight vectors to use: 952
  Number of unique weight vectors: 898
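
The load step groups identical weight vectors, computes each unique vector's pureness (the fraction of its occurrences that are true matches; the 0.947 group is consistent with the 19-occurrence vector in the frequency table, 18/19 ≈ 0.947), and removes the minority-class occurrences of any non-pure vector. A sketch with illustrative names (the behaviour for exact 50/50 groups is not shown in the log and is an assumption):

```python
from collections import defaultdict

def pureness_filter(weight_vectors, labels):
    """Group identical weight vectors, compute each group's pureness
    (fraction of true-match occurrences), and drop the minority-class
    occurrences of any non-pure group, keeping its majority label."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)

    kept = []
    for vec, labs in groups.items():
        majority = 2 * sum(labs) >= len(labs)  # True if matches dominate
        # keep one entry per occurrence of the majority label only
        kept.extend((vec, majority) for lab in labs if lab == majority)
    return kept
```

On this file the filter would drop the single minority-class occurrence, matching 953 loaded vectors reduced to a final 952.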

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (898, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 898 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 898 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 812 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 157 matches and 655 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (157, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (655, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 655 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 655 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 3 matches and 72 non-matches
    Purity of oracle classification:  0.960
    Entropy of oracle classification: 0.242
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)365_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 365), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)365_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1035
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1035 weight vectors
  Containing 223 true matches and 812 true non-matches
    (21.55% true matches)
  Identified 981 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   944  (96.23%)
          2 :    34  (3.47%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 981 unique weight vectors)
Pureness (proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 791

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1034
  Number of unique weight vectors: 981

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (981, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 981 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 981 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 894 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 160 matches and 734 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (160, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (734, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 734 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 734 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 3 matches and 74 non-matches
    Purity of oracle classification:  0.961
    Entropy of oracle classification: 0.238
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

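The purity and entropy figures reported by the oracle step follow the usual two-class definitions: with m matches and n non-matches in the sample, purity is max(m, n) / (m + n) and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (the function name is illustrative, not taken from the program):

```python
import math

def cluster_purity_entropy(num_match, num_non_match):
    """Two-class purity and binary Shannon entropy of a classified sample."""
    total = num_match + num_non_match
    p = num_match / total  # proportion of matches
    purity = max(num_match, num_non_match) / total
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # by convention 0 * log2(0) = 0
            entropy -= q * math.log(q, 2)
    return purity, entropy

# The oracle block above: 3 matches, 74 non-matches
purity, entropy = cluster_purity_entropy(3, 74)
print(round(purity, 3), round(entropy, 3))  # 0.961 0.238
```

Note that a perfectly balanced sample gives purity 0.5 and entropy 1.0, which is exactly the initial (0.5, 1.0) pair every cluster starts with in the queue listings below.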
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)222_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 222), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)222_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 605
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 605 weight vectors
  Containing 212 true matches and 393 true non-matches
    (35.04% true matches)
  Identified 571 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   553  (96.85%)
          2 :    15  (2.63%)
          3 :     2  (0.35%)
         16 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 571 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 390

Removed 1 non-pure weight vector

Final number of weight vectors to use: 604
  Number of unique weight vectors: 571

Time to load and analyse the weight vector file: 0.01 sec

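The cleaning step above groups identical weight vectors, computes each unique vector's pureness (the fraction of its occurrences generated by true matches), and removes the minority-class copies of any non-pure vector. A rough sketch of that logic, assuming (match_status, weight_tuple) pairs; all names are illustrative:

```python
from collections import defaultdict

def remove_non_pure(weight_vec_list):
    """weight_vec_list: list of (is_match, weights) pairs.
    Returns the list with minority-class copies of non-pure
    unique weight vectors removed."""
    occ = defaultdict(lambda: [0, 0])  # weights -> [num_match, num_non_match]
    for is_match, weights in weight_vec_list:
        occ[weights][0 if is_match else 1] += 1
    cleaned = []
    for is_match, weights in weight_vec_list:
        num_m, num_n = occ[weights]
        pureness = num_m / (num_m + num_n)
        if pureness in (0.0, 1.0):  # pure vector: keep all copies
            cleaned.append((is_match, weights))
        else:  # non-pure: keep only the majority class
            majority_is_match = num_m > num_n
            if is_match == majority_is_match:
                cleaned.append((is_match, weights))
    return cleaned

# One weight vector occurs 15 times as a match and once as a non-match
# (pureness 0.938); the single minority non-match copy is removed.
vecs = [(True, (1.0, 0.9))] * 15 + [(False, (1.0, 0.9))] + [(False, (0.1, 0.2))]
print(len(remove_non_pure(vecs)))  # 17 -> 16
```

This matches the log above, where the one vector with pureness 0.938 loses its minority-class copy, shrinking 605 weight vectors to 604.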
Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (571, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 571 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

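The reported sample sizes (82 of 571, 71 of 338, 81 of 544, ...) are consistent with Cochran's sample-size formula at a 10% margin of error and 95% confidence, with a finite population correction, using the cluster's estimated match proportion. This is an assumption inferred from the numbers, not confirmed from the code, and the exact rounding may differ:

```python
def sample_size(population, match_prop, error=0.10, z=1.96):
    """Cochran's formula with finite population correction (assumed).
    z = 1.96 corresponds to 95% confidence."""
    n0 = (z ** 2) * match_prop * (1.0 - match_prop) / (error ** 2)
    n = n0 / (1.0 + (n0 - 1.0) / population)
    return int(n)

print(sample_size(571, 0.5))    # 82, as in the log above
print(sample_size(338, 0.378))  # 71, as in Loop 2 below
```

The smaller a cluster and the more skewed its estimated match proportion, the fewer manual classifications the formula requests, which is why later loops consume the budget more slowly.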
Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 571 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

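Farthest-first selection, as listed above, starts from one vector and repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads toward the "corners" of the weight-vector space. A minimal sketch using Euclidean distance (the actual distance metric and the choice of start vector are assumptions):

```python
import math

def farthest_first(vectors, k, start=0):
    """Select k vectors by farthest-first traversal."""
    selected = [vectors[start]]
    # Minimum distance of each vector to the selected set so far
    min_dist = [math.dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected

pts = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.1), (1.0, 0.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0)]
```

The near-duplicate point (0.1, 0.1) is skipped in favour of the distant corner (1.0, 0.0), mirroring how the selection above favours extreme weight vectors over dense central ones.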
Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 31 matches and 51 non-matches
    Purity of oracle classification:  0.622
    Entropy of oracle classification: 0.957
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 489 weight vectors
  Based on 31 matches and 51 non-matches
  Classified 151 matches and 338 non-matches

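The split step trains a binary classifier on the oracle-labelled sample (31 matches, 51 non-matches) and partitions the cluster's remaining 489 vectors into a predicted-match and a predicted-non-match child cluster, which become the two queue entries in Loop 2. A sketch of that split; a nearest-centroid classifier stands in here for the program's SVM, and all names are illustrative:

```python
import math

def classifier_split(train_match, train_non_match, unlabelled):
    """Split a cluster's unlabelled vectors using a classifier trained
    on the oracle-labelled sample (nearest-centroid stand-in for SVM)."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    cm = centroid(train_match)
    cn = centroid(train_non_match)
    matches, non_matches = [], []
    for v in unlabelled:
        # Assign each vector to the closer class centroid
        if math.dist(v, cm) < math.dist(v, cn):
            matches.append(v)
        else:
            non_matches.append(v)
    return matches, non_matches

m, n = classifier_split([[0.9, 0.8], [0.8, 0.9]], [[0.1, 0.2], [0.2, 0.1]],
                        [[0.85, 0.9], [0.15, 0.1]])
print(len(m), len(n))  # 1 1
```

Both child clusters inherit the parent sample's purity, entropy, and estimated match proportion, which is why the two queue entries below carry identical (0.622, 0.957, 0.378) statistics despite their different sizes.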
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)
    (338, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)

Current size of match and non-match training data sets: 31 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 338 weight vectors
- Estimated match proportion 0.378

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 338 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.833, 0.833, 0.550, 0.500, 0.688] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.474, 0.692, 0.826, 0.484, 0.545] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.700, 0.536, 0.353, 0.647, 0.571] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.500, 0.452, 0.632, 0.714, 0.667] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.826, 0.286, 0.857, 0.643] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 5 matches and 66 non-matches
    Purity of oracle classification:  0.930
    Entropy of oracle classification: 0.367
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(10)49_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984615
recall                 0.214047
f-measure              0.351648
da                           65
dm                            0
ndm                           0
tp                           64
fp                            1
tn                  4.76529e+07
fn                          235
Name: (10, 1 - acm diverg, 49), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)49_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 591
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 591 weight vectors
  Containing 190 true matches and 401 true non-matches
    (32.15% true matches)
  Identified 544 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   511  (93.93%)
          2 :    30  (5.51%)
          3 :     2  (0.37%)
         14 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 544 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 163
     0.929 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 380

Removed 1 non-pure weight vector

Final number of weight vectors to use: 590
  Number of unique weight vectors: 544

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (544, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 544 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 544 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 28 matches and 53 non-matches
    Purity of oracle classification:  0.654
    Entropy of oracle classification: 0.930
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 463 weight vectors
  Based on 28 matches and 53 non-matches
  Classified 156 matches and 307 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.654320987654321, 0.9301497323974337, 0.345679012345679)
    (307, 0.654320987654321, 0.9301497323974337, 0.345679012345679)

Current size of match and non-match training data sets: 28 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.93
- Size 307 weight vectors
- Estimated match proportion 0.346

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 307 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.000, 0.600, 0.818, 0.571, 0.524] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.556, 0.348, 0.467, 0.636, 0.412] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.538, 0.600, 0.471, 0.632, 0.688] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [0.800, 0.000, 0.444, 0.545, 0.333, 0.111, 0.533] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.462, 0.667, 0.636, 0.368, 0.500] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.875, 0.778, 0.471, 0.706, 0.714] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 0 matches and 68 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  68
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

65.0
Analysing file: diverg(10)904_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 904), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)904_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 496
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 496 weight vectors
  Containing 205 true matches and 291 true non-matches
    (41.33% true matches)
  Identified 467 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   450  (96.36%)
          2 :    14  (3.00%)
          3 :     2  (0.43%)
         12 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 467 unique weight vectors)
Pureness (as the proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 288

Removed 1 non-pure weight vector

Final number of weight vectors to use: 495
  Number of unique weight vectors: 467

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (467, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 467 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 467 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
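
The farthest-first traversal used for this selection greedily picks each next vector to maximise its distance from the closest already-selected vector, spreading the sample across the weight space. A minimal sketch, assuming Euclidean distance and a seeded random first pick (the script's tie-breaking and start choice may differ):

```python
import math
import random

def farthest_first(vectors, k, seed=42):
    """Select k vectors; each pick maximises the distance to its
    nearest already-selected vector (farthest-first traversal)."""
    rng = random.Random(seed)
    selected = [rng.randrange(len(vectors))]
    # Distance from every vector to its nearest selected vector
    dist = [math.dist(v, vectors[selected[0]]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=dist.__getitem__)
        selected.append(nxt)
        for i, v in enumerate(vectors):  # update nearest-selected distances
            dist[i] = min(dist[i], math.dist(v, vectors[nxt]))
    return [vectors[i] for i in selected]
```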

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and wrongly classify 0
  Classified 35 matches and 44 non-matches
    Purity of oracle classification:  0.557
    Entropy of oracle classification: 0.991
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0
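
The purity and entropy reported for each oracle round follow the usual two-class definitions: purity is the majority-class fraction and entropy is the Shannon entropy (in bits) of the match/non-match split. A minimal sketch, which reproduces the 0.557 / 0.991 above for 35 matches and 44 non-matches:

```python
import math

def purity_entropy(num_match, num_nonmatch):
    """Two-class purity (majority fraction) and entropy in bits."""
    total = num_match + num_nonmatch
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```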

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 388 weight vectors
  Based on 35 matches and 44 non-matches
  Classified 149 matches and 239 non-matches
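
The split step trains a classifier on the oracle-labelled sample and partitions the remaining cluster by its predictions, producing the two queue entries seen in the next loop. The log uses an SVM; as a dependency-free stand-in (a hypothetical simplification, not the script's actual classifier), a nearest-centroid split illustrates the mechanics:

```python
def centroid(vectors):
    """Component-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(unlabelled, matches, nonmatches):
    """Split remaining weight vectors into predicted-match and
    predicted-non-match sub-clusters (nearest-centroid stand-in
    for the SVM used in the log)."""
    cm, cn = centroid(matches), centroid(nonmatches)
    sqdist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    pred_m = [v for v in unlabelled if sqdist(v, cm) <= sqdist(v, cn)]
    pred_n = [v for v in unlabelled if sqdist(v, cm) > sqdist(v, cn)]
    return pred_m, pred_n
```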

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.5569620253164557, 0.9906174973781801, 0.4430379746835443)
    (239, 0.5569620253164557, 0.9906174973781801, 0.4430379746835443)

Current size of match and non-match training data sets: 35 / 44

Selected cluster (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 239 weight vectors
- Estimated match proportion 0.443

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 239 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [0.890, 1.000, 0.281, 0.136, 0.183, 0.250, 0.163] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 4 matches and 64 non-matches
    Purity of oracle classification:  0.941
    Entropy of oracle classification: 0.323
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)689_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 689), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)689_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 682
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 682 weight vectors
  Containing 201 true matches and 481 true non-matches
    (29.47% true matches)
  Identified 637 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   603  (94.66%)
          2 :    31  (4.87%)
          3 :     2  (0.31%)
         11 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 637 unique weight vectors)
Pureness (as a fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 460

Removed 1 non-pure weight vector

Final number of weight vectors to use: 681
  Number of unique weight vectors: 637

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (637, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 637 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 637 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 26 matches and 57 non-matches
    Purity of oracle classification:  0.687
    Entropy of oracle classification: 0.897
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 554 weight vectors
  Based on 26 matches and 57 non-matches
  Classified 129 matches and 425 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (129, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)
    (425, 0.6867469879518072, 0.8968928834064589, 0.3132530120481928)

Current size of match and non-match training data sets: 26 / 57

Selected cluster (queue ordering: random):
- Purity 0.69 and entropy 0.90
- Size 425 weight vectors
- Estimated match proportion 0.313

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 425 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 13 matches and 56 non-matches
    Purity of oracle classification:  0.812
    Entropy of oracle classification: 0.698
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)389_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 389), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)389_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 489
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 489 weight vectors
  Containing 222 true matches and 267 true non-matches
    (45.40% true matches)
  Identified 453 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   434  (95.81%)
          2 :    16  (3.53%)
          3 :     2  (0.44%)
         17 :     1  (0.22%)

Identified 1 non-pure unique weight vector (from 453 unique weight vectors)
Pureness (as a fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 264

Removed 1 non-pure weight vector

Final number of weight vectors to use: 488
  Number of unique weight vectors: 453

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (453, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 453 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 79

Perform initial selection using "far" method

Farthest first selection of 79 weight vectors from 453 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 79 weight vectors
  The oracle will correctly classify 79 weight vectors and misclassify 0
  Classified 35 matches and 44 non-matches
    Purity of oracle classification:  0.557
    Entropy of oracle classification: 0.991
    Number of true matches:      35
    Number of false matches:     0
    Number of true non-matches:  44
    Number of false non-matches: 0
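
The purity and entropy values reported for an oracle classification follow the usual two-class definitions: purity is the fraction of the classified vectors in the majority class, and entropy is the binary (base-2) entropy of the match proportion. A minimal sketch of both (function names are mine, not from the original script):

```python
import math

def purity(num_matches, num_non_matches):
    # Fraction of the classified vectors in the majority class.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Binary (base-2) entropy of the match / non-match proportions.
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        p = count / total
        if p > 0:
            h -= p * math.log2(p)
    return h

# The 35-match / 44-non-match oracle result above:
print(round(purity(35, 44), 3))   # 0.557
print(round(entropy(35, 44), 3))  # 0.991
```

These are also the per-cluster statistics carried in the queue tuples (size, purity, entropy, estimated match proportion) printed at the start of each loop.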

Deleted 79 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 374 weight vectors
  Based on 35 matches and 44 non-matches
  Classified 149 matches and 225 non-matches
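
The split step trains a classifier on the oracle-labelled weight vectors and partitions the remaining cluster by predicted match status. A sketch of such a split with scikit-learn's `SVC` (the linear kernel and other settings are assumptions; the original script may configure its SVM differently):

```python
from sklearn.svm import SVC

def svm_split(match_vectors, non_match_vectors, unlabelled):
    # Train a binary SVM on the oracle-classified weight vectors,
    # then split the remaining cluster by predicted class.
    X = match_vectors + non_match_vectors
    y = [1] * len(match_vectors) + [0] * len(non_match_vectors)
    clf = SVC(kernel='linear')
    clf.fit(X, y)
    pred = clf.predict(unlabelled)
    matches = [v for v, p in zip(unlabelled, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled, pred) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters re-enter the queue carrying the parent cluster's purity, entropy, and estimated match proportion, as the next loop's queue listing shows.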

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 79
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.5569620253164557, 0.9906174973781801, 0.4430379746835443)
    (225, 0.5569620253164557, 0.9906174973781801, 0.4430379746835443)

Current size of match and non-match training data sets: 35 / 44

Selected cluster (queue ordering: random) with:
- Purity 0.56 and entropy 0.99
- Size 225 weight vectors
- Estimated match proportion 0.443

Sample size for this cluster: 67

Farthest first selection of 67 weight vectors from 225 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
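
Farthest-first selection, as used above, is a greedy traversal: start from one vector, then repeatedly add the vector whose distance to its nearest already-selected vector is largest. A minimal Euclidean version (the original's choice of start vector and distance measure may differ):

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over Euclidean distance.
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        # Choose the vector maximising the distance to its nearest
        # already-selected vector.
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

This tends to pick vectors spread across the whole cluster, which is why the samples above mix clear matches and clear non-matches.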

Perform oracle with 100.00% accuracy on 67 weight vectors
  The oracle will correctly classify 67 weight vectors and misclassify 0
  Classified 6 matches and 61 non-matches
    Purity of oracle classification:  0.910
    Entropy of oracle classification: 0.435
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0
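
The oracle simulates a human reviewer with a configurable accuracy; at 100%, as in this run, every queried vector receives its true label, so all false match/non-match counts stay zero. One way to sketch it (the per-label flip model is an assumption about the script's error mechanism):

```python
import random

def oracle_classify(true_labels, accuracy, seed=42):
    # Return each true label (1 = match, 0 = non-match) unchanged with
    # probability `accuracy`, otherwise flip it, simulating an
    # imperfect human oracle.
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else 1 - lab
            for lab in true_labels]
```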

Deleted 67 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)788_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 788), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)788_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 314
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 314 weight vectors
  Containing 195 true matches and 119 true non-matches
    (62.10% true matches)
  Identified 283 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   267  (94.35%)
          2 :    13  (4.59%)
          3 :     2  (0.71%)
         15 :     1  (0.35%)

Identified 1 non-pure unique weight vector (from 283 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 166
     0.933 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 116

Removed 1 non-pure weight vector

Final number of weight vectors to use: 313
  Number of unique weight vectors: 283
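
The pureness step groups identical weight vectors, computes the fraction of true matches among each group's copies, and removes the minority-class copies of any non-pure group; here the one group with pureness 0.933 (consistent with 14 match copies and 1 non-match copy of the 15-times-repeated vector) loses a single copy. A sketch under those assumptions:

```python
def remove_minority_copies(labels_per_vector):
    # labels_per_vector maps each unique weight vector to the list of
    # true-match labels (1/0) of its copies.  Drop minority-class
    # copies so every remaining unique vector is pure.
    kept = {}
    removed = 0
    for vec, labels in labels_per_vector.items():
        pureness = sum(labels) / len(labels)   # fraction of matches
        majority = 1 if pureness >= 0.5 else 0
        pure = [lab for lab in labels if lab == majority]
        removed += len(labels) - len(pure)
        kept[vec] = pure
    return kept, removed
```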

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (283, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 283 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 72

Perform initial selection using "far" method

Farthest first selection of 72 weight vectors from 283 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.911, 1.000, 0.097, 0.025, 0.075, 0.288, 0.486] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.435, 0.786, 0.800, 0.588, 0.810] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and misclassify 0
  Classified 34 matches and 38 non-matches
    Purity of oracle classification:  0.528
    Entropy of oracle classification: 0.998
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 211 weight vectors
  Based on 34 matches and 38 non-matches
  Classified 138 matches and 73 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 72
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.5277777777777778, 0.9977724720899821, 0.4722222222222222)
    (73, 0.5277777777777778, 0.9977724720899821, 0.4722222222222222)

Current size of match and non-match training data sets: 34 / 38

Selected cluster (queue ordering: random) with:
- Purity 0.53 and entropy 1.00
- Size 73 weight vectors
- Estimated match proportion 0.472

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 73 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and misclassify 0
  Classified 0 matches and 42 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  42
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(10)207_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (10, 1 - acm diverg, 207), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)207_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 458
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 458 weight vectors
  Containing 210 true matches and 248 true non-matches
    (45.85% true matches)
  Identified 425 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   411  (96.71%)
          2 :    11  (2.59%)
          3 :     2  (0.47%)
         19 :     1  (0.24%)

Identified 1 non-pure unique weight vector (from 425 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 177
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 247

Removed 1 non-pure weight vector

Final number of weight vectors to use: 457
  Number of unique weight vectors: 425

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (425, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 425 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 425 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and misclassify 0
  Classified 33 matches and 45 non-matches
    Purity of oracle classification:  0.577
    Entropy of oracle classification: 0.983
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 347 weight vectors
  Based on 33 matches and 45 non-matches
  Classified 138 matches and 209 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (138, 0.5769230769230769, 0.9828586897127056, 0.4230769230769231)
    (209, 0.5769230769230769, 0.9828586897127056, 0.4230769230769231)

Current size of match and non-match training data sets: 33 / 45

Selected cluster (queue ordering: random) with:
- Purity 0.58 and entropy 0.98
- Size 138 weight vectors
- Estimated match proportion 0.423

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 138 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.958, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and misclassify 0
  Classified 49 matches and 7 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(20)738_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 738), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)738_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 548
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 548 weight vectors
  Containing 226 true matches and 322 true non-matches
    (41.24% true matches)
  Identified 509 unique weight vectors
  Frequency distribution of occurences of weight vectors:
    Occurence : Number of weight vectors that occur that often
          1 :   490  (96.27%)
          2 :    16  (3.14%)
          3 :     2  (0.39%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 509 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 319

Removed 1 non-pure weight vector

Final number of weight vectors to use: 547
  Number of unique weight vectors: 509

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (509, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 509 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 509 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
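Farthest-first selection is a greedy k-center heuristic: each step picks the vector whose minimum distance to the already-selected set is largest. A minimal sketch, assuming Euclidean distance and an arbitrary first pick (the actual program's seeding and metric may differ):

```python
import math

def farthest_first(vectors, k):
    # Greedy k-center selection over a list of equal-length tuples.
    # Assumes 1 <= k <= len(vectors); first pick is simply vectors[0].
    selected = [vectors[0]]
    # min_d[j]: distance from vectors[j] to its nearest selected vector.
    min_d = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], math.dist(v, vectors[i]))
    return selected
```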

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 33 matches and 48 non-matches
    Purity of oracle classification:  0.593
    Entropy of oracle classification: 0.975
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 428 weight vectors
  Based on 33 matches and 48 non-matches
  Classified 152 matches and 276 non-matches
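The split step trains a classifier on the oracle-labelled sample and divides the remaining weight vectors into two child clusters. A sketch using scikit-learn's SVC on synthetic stand-in data (the SVM implementation and parameters actually used by the program are not shown in this log):

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-ins: 81 oracle-labelled 7-dimensional weight vectors
# and 428 still-unclassified vectors (shapes taken from the log above).
rng = np.random.default_rng(42)
train_X = rng.random((81, 7))
train_y = rng.integers(0, 2, size=81)     # 1 = match, 0 = non-match
rest_X = rng.random((428, 7))

clf = SVC(kernel='linear').fit(train_X, train_y)
pred = clf.predict(rest_X)

match_cluster = rest_X[pred == 1]         # child clusters pushed back
non_match_cluster = rest_X[pred == 0]     # onto the processing queue
print(len(match_cluster) + len(non_match_cluster))  # 428
```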

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.5925925925925926, 0.975119064940866, 0.4074074074074074)
    (276, 0.5925925925925926, 0.975119064940866, 0.4074074074074074)

Current size of match and non-match training data sets: 33 / 48

Selected cluster (queue ordering: random) with:
- Purity 0.59 and entropy 0.98
- Size 152 weight vectors
- Estimated match proportion 0.407

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 152 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 53 matches and 5 non-matches
    Purity of oracle classification:  0.914
    Entropy of oracle classification: 0.424
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)770_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976562
recall                  0.41806
f-measure               0.58548
da                          128
dm                            0
ndm                           0
tp                          125
fp                            3
tn                  4.76529e+07
fn                          174
Name: (10, 1 - acm diverg, 770), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)770_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 586
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 586 weight vectors
  Containing 134 true matches and 452 true non-matches
    (22.87% true matches)
  Identified 573 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   565  (98.60%)
          2 :     5  (0.87%)
          3 :     2  (0.35%)
          5 :     1  (0.17%)

Identified 0 non-pure unique weight vectors (from 573 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 121
     0.000 : 452

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 586
  Number of unique weight vectors: 573

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (573, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 573 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 573 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.769, 0.850, 0.353, 0.500, 0.750] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.481, 0.474, 0.471, 0.773, 0.450] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 31 matches and 51 non-matches
    Purity of oracle classification:  0.622
    Entropy of oracle classification: 0.957
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 491 weight vectors
  Based on 31 matches and 51 non-matches
  Classified 90 matches and 401 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (90, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)
    (401, 0.6219512195121951, 0.956652272148091, 0.3780487804878049)

Current size of match and non-match training data sets: 31 / 51

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 90 weight vectors
- Estimated match proportion 0.378

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 90 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 39 matches and 6 non-matches
    Purity of oracle classification:  0.867
    Entropy of oracle classification: 0.567
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

128.0
Analysing the file: diverg(15)943_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 943), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)943_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 877
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 877 weight vectors
  Containing 214 true matches and 663 true non-matches
    (24.40% true matches)
  Identified 825 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   789  (95.64%)
          2 :    33  (4.00%)
          3 :     2  (0.24%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 825 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 642

Removed 1 non-pure weight vector

Final number of weight vectors to use: 876
  Number of unique weight vectors: 825

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (825, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 825 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 825 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 739 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 181 matches and 558 non-matches
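
The SVM step trains on the oracle-labelled vectors and splits the remaining cluster into predicted matches and non-matches. For a self-contained illustration this sketch swaps the SVM for a nearest-centroid rule; the shape of the splitting logic is the same, but the decision boundary of the real program's SVM will differ:

```python
def split_cluster(train_matches, train_non_matches, cluster_vecs):
    """Split cluster_vecs into predicted matches / non-matches using a
    nearest-centroid rule (a simple stand-in for the SVM in the log)."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    m_cent = centroid(train_matches)       # centroid of labelled matches
    n_cent = centroid(train_non_matches)   # centroid of labelled non-matches
    matches, non_matches = [], []
    for v in cluster_vecs:
        if sq_dist(v, m_cent) < sq_dist(v, n_cent):
            matches.append(v)
        else:
            non_matches.append(v)
    return matches, non_matches
```

The two resulting sub-clusters are what the program pushes back onto the queue for the next loop.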

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (181, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (558, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 558 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 558 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
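
The "farthest first" selections in this log can be sketched as a greedy farthest-first traversal: repeatedly pick the vector whose minimum Euclidean distance to the already-selected set is largest. The seed choice and distance metric of the actual program may differ; this is an illustrative version only:

```python
import math

def farthest_first(vectors, k, seed_index=0):
    """Greedy farthest-first traversal over Euclidean distance.

    Starts from vectors[seed_index] and repeatedly adds the vector that is
    farthest (by minimum distance) from everything selected so far.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [seed_index]
    # Minimum distance from every vector to the selected set so far
    min_dist = [dist(v, vectors[seed_index]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], dist(v, vectors[nxt]))
    return [vectors[i] for i in selected]
```

Already-selected vectors end up with a minimum distance of zero, so they are never picked twice.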

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 0 matches and 74 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(20)453_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (20, 1 - acm diverg, 453), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)453_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1026
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1026 weight vectors
  Containing 198 true matches and 828 true non-matches
    (19.30% true matches)
  Identified 984 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   949  (96.44%)
          2 :    32  (3.25%)
          3 :     2  (0.20%)
          7 :     1  (0.10%)
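
The occurrence distribution above is a count of counts: how often each distinct weight vector occurs, then how many unique vectors share each occurrence count. A sketch, assuming the weight vectors are given as lists of floats:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Tally how many unique weight vectors occur 1x, 2x, 3x, ... times."""
    # Count occurrences of each distinct vector (tuples are hashable)
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    # Count how many unique vectors share each occurrence count
    return Counter(vec_counts.values())
```

For example, four vectors where one value appears twice yields `{2: 1, 1: 2}`: one unique vector occurring twice, two occurring once.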

Identified 0 non-pure unique weight vectors (from 984 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 808

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1026
  Number of unique weight vectors: 984

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (984, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 984 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 984 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 897 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 93 matches and 804 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (93, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (804, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 804 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 804 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(10)230_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 230), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)230_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 742
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 742 weight vectors
  Containing 163 true matches and 579 true non-matches
    (21.97% true matches)
  Identified 721 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   710  (98.47%)
          2 :     8  (1.11%)
          3 :     2  (0.28%)
         10 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 721 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 144
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 576
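
The pureness of a unique weight vector is the fraction of its occurrences that were generated by true matching record pairs; vectors with pureness strictly between 0 and 1 are the "non-pure" ones removed above. A sketch of the per-vector computation (names are illustrative):

```python
from collections import defaultdict

def pureness_counts(weight_vectors, labels):
    """Map each distinct weight vector to the fraction of its occurrences
    that are true matches (its 'pureness')."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for vec, is_match in zip(weight_vectors, labels):
        key = tuple(vec)
        totals[key] += 1
        if is_match:
            matches[key] += 1
    return {key: matches[key] / totals[key] for key in totals}
```

A vector occurring ten times with nine true-match labels gets pureness 0.9, as in the table above.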

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 741
  Number of unique weight vectors: 721

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (721, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 721 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 721 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 637 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 106 matches and 531 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (106, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (531, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 106 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 106 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 43 matches and 5 non-matches
    Purity of oracle classification:  0.896
    Entropy of oracle classification: 0.482
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0
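
The purity and entropy figures printed above can be reproduced from the match/non-match counts of the oracle-classified sample. A minimal sketch (the function names are illustrative, not taken from the program):

```python
import math

def purity(num_matches, num_non_matches):
    # Purity: fraction of the majority class among the classified vectors
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Shannon entropy (base 2) of the match / non-match distribution
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# The 43 matches / 5 non-matches classified above give:
print(round(purity(43, 5), 3))   # 0.896
print(round(entropy(43, 5), 3))  # 0.482
```

The same two formulas reproduce the (0.624, 0.956) and (0.678, 0.906) pairs reported for the later oracle calls in this log.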

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analyzing file: diverg(15)828_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 828), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)828_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 803
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 803 weight vectors
  Containing 226 true matches and 577 true non-matches
    (28.14% true matches)
  Identified 746 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   709  (95.04%)
          2 :    34  (4.56%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)
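
The frequency distribution above tallies how often each distinct weight vector occurs, with percentages taken over the number of unique vectors. A hedged sketch of the underlying tallying, assuming the weight vectors are stored as tuples (the sample data here is hypothetical):

```python
from collections import Counter

# Hypothetical weight vectors; in the program these come from the loaded file
weight_vectors = [(1.0, 0.9), (1.0, 0.9), (0.5, 0.4), (1.0, 0.9), (0.2, 0.1)]

vec_counts = Counter(weight_vectors)      # occurrences per unique vector
freq_dist = Counter(vec_counts.values())  # occurrence -> number of unique vectors

num_unique = len(vec_counts)
for occ in sorted(freq_dist):
    n = freq_dist[occ]
    print('%5d : %5d  (%.2f%%)' % (occ, n, 100.0 * n / num_unique))
```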

Identified 1 non-pure unique weight vectors (from 746 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 556

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 802
  Number of unique weight vectors: 746

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (746, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 746 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 746 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
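
The farthest-first selection listed above greedily picks, at each step, the vector whose distance to its nearest already-selected vector is largest. A minimal Euclidean sketch; the seed choice and distance metric of the actual program are assumptions here:

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    # Seed with the first vector, then repeatedly add the vector whose
    # distance to the closest selected vector is maximal
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(euclidean(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

pts = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.0, 1.0)]
print(farthest_first(pts, 3))
```

This greedy traversal spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.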

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 32 matches and 53 non-matches
    Purity of oracle classification:  0.624
    Entropy of oracle classification: 0.956
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 661 weight vectors
  Based on 32 matches and 53 non-matches
  Classified 331 matches and 330 non-matches
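
After the oracle labels the sample, the program trains an SVM on those labels and splits the rest of the cluster by the predicted class, pushing both halves back onto the queue. A minimal stand-in sketch using a nearest-centroid classifier in place of the SVM (the program's actual classifier and feature handling are not shown in this log):

```python
import math

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def split_cluster(unlabelled, match_sample, non_match_sample):
    # Assign each unlabelled vector to the closer class centroid,
    # yielding two sub-clusters to push back onto the queue
    cm, cn = centroid(match_sample), centroid(non_match_sample)
    dist = lambda u, v: math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    matches = [v for v in unlabelled if dist(v, cm) <= dist(v, cn)]
    non_matches = [v for v in unlabelled if dist(v, cm) > dist(v, cn)]
    return matches, non_matches

m, n = split_cluster([[0.9, 0.8], [0.1, 0.2]], [[1.0, 1.0]], [[0.0, 0.0]])
```

Each resulting sub-cluster inherits the purity, entropy, and estimated match proportion of its parent's oracle sample, as the identical triples in the Loop 2 queue listing show.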

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (331, 0.6235294117647059, 0.9555111232924128, 0.3764705882352941)
    (330, 0.6235294117647059, 0.9555111232924128, 0.3764705882352941)

Current size of match and non-match training data sets: 32 / 53

Selected cluster with (queue ordering: random):
- Purity 0.62 and entropy 0.96
- Size 331 weight vectors
- Estimated match proportion 0.376

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 331 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 41 matches and 30 non-matches
    Purity of oracle classification:  0.577
    Entropy of oracle classification: 0.983
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  30
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)959_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 959), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)959_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 979
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 979 weight vectors
  Containing 195 true matches and 784 true non-matches
    (19.92% true matches)
  Identified 937 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   902  (96.26%)
          2 :    32  (3.42%)
          3 :     2  (0.21%)
          7 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 937 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.000 : 764

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 979
  Number of unique weight vectors: 937

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (937, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 937 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 937 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.500, 0.286, 0.333, 0.222, 0.179] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 850 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 151 matches and 699 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (699, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 699 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 699 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.909, 0.700, 0.500, 0.306, 0.824] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 1 matches and 74 non-matches
    Purity of oracle classification:  0.987
    Entropy of oracle classification: 0.102
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analyzing file: diverg(10)363_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (10, 1 - acm diverg, 363), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)363_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1046
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1046 weight vectors
  Containing 225 true matches and 821 true non-matches
    (21.51% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   952  (96.26%)
          2 :    34  (3.44%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 989 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 800

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1045
  Number of unique weight vectors: 989

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
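
Farthest-first selection picks a seed vector and then repeatedly adds the vector whose minimum Euclidean distance to the already-selected set is largest, so the sample covers the extremes of the weight-vector space. A minimal sketch (the seeding and tie-breaking rules here are assumptions, not the program's exact behaviour):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector,
    then repeatedly take the remaining vector that maximises the
    minimum distance to any already-selected vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        nxt = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected
```

The greedy traversal gives a 2-approximation to the optimal k-center cover, which is why the selected vectors above spread across both clear matches and clear non-matches.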

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 32 matches and 55 non-matches
    Purity of oracle classification:  0.632
    Entropy of oracle classification: 0.949
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
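
The purity and entropy figures printed above follow from the oracle's class counts: purity is the fraction of the majority class, and entropy is the binary Shannon entropy of the match/non-match split. A short check (hypothetical helper, reproducing the 0.632 / 0.949 values from 32 matches and 55 non-matches):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity: fraction of the majority class.
    Entropy: binary Shannon entropy (bits) of the class split."""
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

A perfectly pure cluster has purity 1.0 and entropy 0.0; a 50/50 split, as in the initial cluster, has purity 0.5 and entropy 1.0.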

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 32 matches and 55 non-matches
  Classified 330 matches and 572 non-matches
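
The split step trains a classifier on the oracle-labelled vectors and partitions the remaining cluster by its predictions. The original program's SVM implementation is not shown here; a sketch using scikit-learn's `SVC` on synthetic stand-in data (the kernel choice and all data below are illustrative assumptions):

```python
import numpy as np
from sklearn.svm import SVC

def split_cluster(X_labelled, y_labelled, X_rest):
    """Train an SVM on the oracle-labelled vectors, then split the
    remaining vectors into predicted matches and non-matches."""
    clf = SVC(kernel="rbf")          # kernel choice is an assumption
    clf.fit(X_labelled, y_labelled)
    pred = clf.predict(X_rest)
    return X_rest[pred == 1], X_rest[pred == 0]

# illustrative stand-in for 32 matches / 55 non-matches and 902 unlabelled
rng = np.random.default_rng(0)
X_lab = np.vstack([rng.uniform(0.6, 1.0, (32, 7)),
                   rng.uniform(0.0, 0.5, (55, 7))])
y_lab = np.array([1] * 32 + [0] * 55)
matches, non_matches = split_cluster(X_lab, y_lab, rng.uniform(0, 1, (902, 7)))
```

Each predicted partition is pushed back onto the cluster queue, which is why the queue length grows from 1 to 2 in the next loop.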

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (330, 0.632183908045977, 0.9489804585630242, 0.367816091954023)
    (572, 0.632183908045977, 0.9489804585630242, 0.367816091954023)

Current size of match and non-match training data sets: 32 / 55

Selected cluster with (queue ordering: random):
- Purity 0.63 and entropy 0.95
- Size 330 weight vectors
- Estimated match proportion 0.368

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 330 vectors
  The selected farthest weight vectors are:
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.854, 1.000, 0.128, 0.163, 0.042, 0.121, 0.138] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.909, 1.000, 1.000, 1.000, 0.947] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.875, 1.000, 0.574, 0.227, 0.167, 0.117, 0.196] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 41 matches and 29 non-matches
    Purity of oracle classification:  0.586
    Entropy of oracle classification: 0.979
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  29
    Number of false non-matches: 0
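
The oracle step simulates a human reviewer with a given accuracy; at 100% accuracy, as throughout this log, every label is returned unchanged. A minimal sketch (`simulate_oracle` is a hypothetical helper, not the program's function):

```python
import random

def simulate_oracle(true_labels, accuracy, seed=42):
    """Return oracle classifications: each true label is kept with
    probability `accuracy` and flipped otherwise."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

With accuracy below 1.0 the false-match and false-non-match counts above would become non-zero, injecting label noise into the training sets.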

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)864_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 864), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)864_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1001
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1001 weight vectors
  Containing 198 true matches and 803 true non-matches
    (19.78% true matches)
  Identified 959 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   924  (96.35%)
          2 :    32  (3.34%)
          3 :     2  (0.21%)
          7 :     1  (0.10%)

Identified 0 non-pure unique weight vectors (from 959 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 176
     0.000 : 783

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 1001
  Number of unique weight vectors: 959

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (959, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 959 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 959 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 872 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 106 matches and 766 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (106, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (766, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 766 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 766 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)

Perform oracle with 100.00 accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 13 matches and 60 non-matches
    Purity of oracle classification:  0.822
    Entropy of oracle classification: 0.676
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing file: diverg(15)764_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 764), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)764_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 801
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 801 weight vectors
  Containing 222 true matches and 579 true non-matches
    (27.72% true matches)
  Identified 747 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   710  (95.05%)
          2 :    34  (4.55%)
          3 :     2  (0.27%)
         17 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 747 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 558

Removed 1 non-pure weight vector

Final number of weight vectors to use: 800
  Number of unique weight vectors: 747

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (747, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 747 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 747 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
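
The purity and entropy reported for each oracle-classified sample follow the standard two-class definitions: purity is the majority-class fraction, entropy the base-2 Shannon entropy of the match/non-match split. A minimal sketch (the function names are illustrative, not taken from the original program):

```python
import math

def purity(num_matches, num_non_matches):
    # Purity: fraction of the sample belonging to the majority class.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Shannon entropy (base 2) of the match/non-match distribution:
    # 0.0 for a pure sample, 1.0 for a 50/50 split.
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# Reproduces the oracle statistics logged above (27 matches, 58 non-matches):
print(round(purity(27, 58), 3))   # 0.682
print(round(entropy(27, 58), 3))  # 0.902
```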

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 662 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 148 matches and 514 non-matches
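
The split step trains a binary classifier on the oracle-labelled sample and partitions the remaining weight vectors into a predicted-match and a predicted-non-match cluster. A hedged sketch using scikit-learn's `svm.SVC`; the training data here is randomly generated and the kernel choice is an assumption, not the original program's configuration:

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(42)

# Hypothetical stand-ins for the oracle-labelled sample: each row is a
# 7-dimensional similarity weight vector, labels are 1 (match) / 0 (non-match).
train_vecs = rng.random((85, 7))
train_labels = (train_vecs.mean(axis=1) > 0.5).astype(int)

# The unlabelled remainder of the cluster that is to be split.
rest_vecs = rng.random((662, 7))

clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
clf.fit(train_vecs, train_labels)
pred = clf.predict(rest_vecs)

# The two sub-clusters that would be pushed back onto the queue.
match_cluster = rest_vecs[pred == 1]
non_match_cluster = rest_vecs[pred == 0]
print(len(match_cluster), len(non_match_cluster))
```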

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (514, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 514 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 514 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.683, 1.000, 0.246, 0.239, 0.070, 0.255, 0.258] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
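
"Farthest first" selection greedily picks, at each step, the unselected vector whose minimum Euclidean distance to the already-selected set is largest, spreading the sample across the cluster. A minimal sketch; seeding with the first vector is an assumption, not necessarily what the original program does:

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    # Greedy max-min selection: each new pick maximises its distance
    # to the closest vector already selected.
    selected = [vectors[0]]  # assumed seed: the first vector
    min_dist = [euclidean(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            d = euclidean(v, vectors[idx])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected

corners = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.5, 0.5), (1.0, 0.0)], 2)
print(corners)  # [(0.0, 0.0), (1.0, 1.0)]
```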

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 8 matches and 64 non-matches
    Purity of oracle classification:  0.889
    Entropy of oracle classification: 0.503
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(10)6_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 6), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)6_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 634
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 634 weight vectors
  Containing 212 true matches and 422 true non-matches
    (33.44% true matches)
  Identified 582 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   546  (93.81%)
          2 :    33  (5.67%)
          3 :     2  (0.34%)
         16 :     1  (0.17%)
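
The occurrence distribution above can be obtained by counting identical weight vectors, e.g. with `collections.Counter`; a small sketch with made-up vectors (the actual program may compute this differently):

```python
from collections import Counter

# Hypothetical weight vectors; identical vectors can be generated by
# different record pairs that share all compared attribute values.
weight_vectors = [
    (1.0, 0.0, 0.5), (1.0, 0.0, 0.5), (0.3, 1.0, 0.9),
    (1.0, 1.0, 1.0), (1.0, 1.0, 1.0), (1.0, 1.0, 1.0),
]

vec_counts = Counter(weight_vectors)      # vector -> how often it occurs
freq_dist = Counter(vec_counts.values())  # occurrence count -> number of vectors

print(len(vec_counts))  # 3 unique weight vectors
for occ, num in sorted(freq_dist.items()):
    pct = 100.0 * num / len(vec_counts)
    print(f"{occ:>4} : {num:>4}  ({pct:.2f}%)")
```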

Identified 1 non-pure unique weight vector (from 582 unique weight vectors)
Pureness (proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 180
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 401

Removed 1 non-pure weight vector

Final number of weight vectors to use: 633
  Number of unique weight vectors: 582

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (582, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 582 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 582 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 500 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 151 matches and 349 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (349, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 349 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 349 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.767, 0.600, 0.857, 0.636, 0.762] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.714, 0.727, 0.750, 0.294, 0.833] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.348, 0.429, 0.526, 0.529, 0.619] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 0.000, 0.769, 0.500, 0.529, 0.818, 0.789] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 7 matches and 61 non-matches
    Purity of oracle classification:  0.897
    Entropy of oracle classification: 0.478
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(15)439_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 439), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)439_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1043
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1043 weight vectors
  Containing 222 true matches and 821 true non-matches
    (21.28% true matches)
  Identified 989 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   952  (96.26%)
          2 :    34  (3.44%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 989 unique weight vectors)
Pureness (proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 800

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1042
  Number of unique weight vectors: 989

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (989, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 989 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 989 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 902 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 145 matches and 757 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (757, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 757 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 757 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 10 matches and 63 non-matches
    Purity of oracle classification:  0.863
    Entropy of oracle classification: 0.576
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
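
The purity and entropy figures reported for each oracle classification can be reproduced from the match/non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch (the function name `purity_entropy` is illustrative, not from the original script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the majority-class fraction; entropy is the binary
    Shannon entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

# The 73-vector oracle call above: 10 matches, 63 non-matches
purity, entropy = purity_entropy(10, 63)
print(round(purity, 3), round(entropy, 3))
```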

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(20)620_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 620), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)620_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1073
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1073 weight vectors
  Containing 226 true matches and 847 true non-matches
    (21.06% true matches)
  Identified 1016 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   979  (96.36%)
          2 :    34  (3.35%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
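
The unique-vector count and frequency distribution reported for each loaded file can be computed with `collections.Counter`, assuming each weight vector is stored as a hashable tuple. The toy vectors below are illustrative, not from the data set:

```python
from collections import Counter

# Toy weight vectors (tuples of similarity weights); duplicates occur when
# different record pairs yield identical weights.
weight_vectors = [
    (1.0, 0.0, 0.5), (1.0, 0.0, 0.5), (0.2, 1.0, 0.9),
    (1.0, 1.0, 1.0), (1.0, 0.0, 0.5),
]

vec_counts = Counter(weight_vectors)  # occurrences per unique vector
print("Number of unique weight vectors:", len(vec_counts))

# Frequency distribution: occurrence count -> number of unique vectors
freq_dist = Counter(vec_counts.values())
for occ, num in sorted(freq_dist.items()):
    print("%10d : %5d  (%.2f%%)" % (occ, num, 100.0 * num / len(vec_counts)))
```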

Identified 1 non-pure unique weight vector (from 1016 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 826

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1072
  Number of unique weight vectors: 1016

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1016, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1016 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1016 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
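
Farthest-first selection, used repeatedly above, greedily adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads over the extremes of the weight-vector space. A minimal sketch assuming Euclidean distance and a fixed starting vector (the original script's metric and start choice may differ):

```python
def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly select the vector
    whose minimum distance to the already-selected set is largest."""
    def dist2(a, b):  # squared Euclidean distance (monotone in distance)
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[start]]
    remaining = [v for i, v in enumerate(vectors) if i != start]
    while len(selected) < k and remaining:
        # For each candidate, the distance to its nearest selected vector
        far = max(remaining, key=lambda v: min(dist2(v, s) for s in selected))
        selected.append(far)
        remaining.remove(far)
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # the near-duplicate (0.1, 0.0) is picked last
```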

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 929 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 332 matches and 597 non-matches
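
After the oracle step, the remaining cluster is split by a classifier trained on the oracle-labelled vectors and each predicted class becomes a child cluster in the queue. The run above uses an SVM; as a dependency-free stand-in (not the actual SVM), this sketch partitions with a nearest-centroid rule, which illustrates the same split-into-two-children step:

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(cluster, match_train, non_match_train):
    """Partition `cluster` by distance to the class centroids of the
    oracle-labelled training vectors (a stand-in for the SVM split)."""
    cm, cn = centroid(match_train), centroid(non_match_train)

    def d2(a, b):  # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))

    match_child, non_match_child = [], []
    for v in cluster:
        (match_child if d2(v, cm) < d2(v, cn) else non_match_child).append(v)
    return match_child, non_match_child

m_train = [(1.0, 1.0), (0.9, 0.8)]   # oracle-labelled matches
n_train = [(0.1, 0.0), (0.0, 0.2)]   # oracle-labelled non-matches
m_child, n_child = split_cluster([(0.95, 0.9), (0.05, 0.1)], m_train, n_train)
```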

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (332, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (597, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 597 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 597 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.692, 0.583, 0.500, 0.750, 0.731] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)673_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976378
recall                 0.414716
f-measure               0.58216
da                          127
dm                            0
ndm                           0
tp                          124
fp                            3
tn                  4.76529e+07
fn                          175
Name: (10, 1 - acm diverg, 673), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)673_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 371
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 371 weight vectors
  Containing 137 true matches and 234 true non-matches
    (36.93% true matches)
  Identified 355 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   344  (96.90%)
          2 :     8  (2.25%)
          3 :     2  (0.56%)
          5 :     1  (0.28%)

Identified 0 non-pure unique weight vectors (from 355 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 123
     0.000 : 232

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 371
  Number of unique weight vectors: 355

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (355, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 355 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 355 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 0.000, 0.391, 0.500, 0.625, 0.353, 0.667] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 37 matches and 38 non-matches
    Purity of oracle classification:  0.507
    Entropy of oracle classification: 1.000
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 280 weight vectors
  Based on 37 matches and 38 non-matches
  Classified 221 matches and 59 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 75
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (221, 0.5066666666666667, 0.999871756640849, 0.49333333333333335)
    (59, 0.5066666666666667, 0.999871756640849, 0.49333333333333335)

Current size of match and non-match training data sets: 37 / 38

Selected cluster (queue ordering: random) with:
- Purity 0.51 and entropy 1.00
- Size 59 weight vectors
- Estimated match proportion 0.493

Sample size for this cluster: 37

Farthest first selection of 37 weight vectors from 59 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.667, 0.857, 0.588, 0.667, 0.385] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.778, 0.636, 0.375, 0.556, 0.625] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.889, 0.875, 0.375, 0.667, 0.533] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [1.000, 0.000, 0.636, 0.571, 0.667, 0.278, 0.778] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.367, 0.733, 0.417, 0.727, 0.474] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.818, 0.636, 0.750, 0.563, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.636, 0.727, 0.278, 0.800, 0.500] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)

Perform oracle with 100.00% accuracy on 37 weight vectors
  The oracle will correctly classify 37 weight vectors and wrongly classify 0
  Classified 0 matches and 37 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  37
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 37 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

127.0
Analysing file: diverg(20)585_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 585), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)585_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1075
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1075 weight vectors
  Containing 227 true matches and 848 true non-matches
    (21.12% true matches)
  Identified 1018 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   981  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1018 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 827

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1074
  Number of unique weight vectors: 1018

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1018, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1018 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1018 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

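The "far" initial selection above is a farthest-first traversal: each new sample is the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the weight-vector space. A minimal sketch under Euclidean distance with an arbitrary first seed (the script's actual metric and seeding may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of numeric tuples:
    repeatedly pick the vector whose minimum distance to the
    already-selected set is largest."""
    if not vectors or k <= 0:
        return []

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed choice is arbitrary in this sketch
    # minimum distance of every candidate to the selected set so far
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        idx = max(range(len(vectors)), key=lambda i: min_d[i])
        if min_d[idx] == 0.0:        # only duplicates remain
            break
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            min_d[i] = min(min_d[i], dist(v, vectors[idx]))
    return selected
```

On the unit square this picks opposite corners before interior points, which is why the selected vectors listed above mix extreme (all-0/all-1) and intermediate similarity profiles.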
Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

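The purity and entropy figures reported for each oracle classification are the majority-class fraction and the binary Shannon entropy (in bits) of the match/non-match split. A sketch that reproduces the numbers above (23 matches, 64 non-matches gives purity 0.736 and entropy 0.833):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity = majority-class fraction of the two classes;
    entropy = binary Shannon entropy (bits) of the class distribution."""
    n = num_match + num_non_match
    if n == 0:
        return 1.0, 0.0
    p = num_match / n
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                   # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

A perfectly balanced cluster gives purity 0.5 and entropy 1.0, which is exactly the initial queue entry `(1018, 0.5, 1.0, 0.5)` shown in Loop 1.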
Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 931 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 819 non-matches

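After the oracle labels a sample, the script splits the remaining unlabelled vectors of the cluster into predicted-match and predicted-non-match children using a classifier trained on all oracle-labelled vectors (an SVM here). As a stdlib-only stand-in for the SVM, the split step can be sketched with a nearest-centroid classifier; the actual SVM kernel and parameters are not shown in this log:

```python
import math

def split_cluster(train_match, train_non_match, unlabelled):
    """Split unlabelled vectors into predicted match / non-match children.
    The script trains an SVM on the oracle-labelled samples; a
    nearest-centroid classifier stands in for it in this sketch."""

    def centroid(vecs):
        n = len(vecs)
        return [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    cm = centroid(train_match)       # centre of labelled matches
    cn = centroid(train_non_match)   # centre of labelled non-matches
    pred_match, pred_non_match = [], []
    for v in unlabelled:
        (pred_match if dist(v, cm) < dist(v, cn) else pred_non_match).append(v)
    return pred_match, pred_non_match
```

The two resulting child clusters are pushed onto the queue, which is why Loop 2 above shows a queue of length 2 with sizes 112 and 819 summing to the 931 classified vectors.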
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (819, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)608_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 608), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)608_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 853
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 853 weight vectors
  Containing 226 true matches and 627 true non-matches
    (26.49% true matches)
  Identified 796 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   759  (95.35%)
          2 :    34  (4.27%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 796 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 606

Removed 1 non-pure weight vector

Final number of weight vectors to use: 852
  Number of unique weight vectors: 796

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (796, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 796 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 796 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 711 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 148 matches and 563 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (563, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 563 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 563 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 10 matches and 62 non-matches
    Purity of oracle classification:  0.861
    Entropy of oracle classification: 0.581
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)220_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 220), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)220_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 637
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 637 weight vectors
  Containing 210 true matches and 427 true non-matches
    (32.97% true matches)
  Identified 585 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   549  (93.85%)
          2 :    33  (5.64%)
          3 :     2  (0.34%)
         16 :     1  (0.17%)

Identified 1 non-pure unique weight vector (from 585 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 406

Removed 1 non-pure weight vector

Final number of weight vectors to use: 636
  Number of unique weight vectors: 585

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (585, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 585 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 585 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 32 matches and 50 non-matches
    Purity of oracle classification:  0.610
    Entropy of oracle classification: 0.965
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 503 weight vectors
  Based on 32 matches and 50 non-matches
  Classified 165 matches and 338 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (165, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)
    (338, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)

Current size of match and non-match training data sets: 32 / 50

Selected cluster with (queue ordering: random):
- Purity 0.61 and entropy 0.96
- Size 165 weight vectors
- Estimated match proportion 0.390

Sample size for this cluster: 59

Farthest first selection of 59 weight vectors from 165 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.875, 1.000, 0.182, 0.267, 0.237, 0.206, 0.167] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.143, 0.143, 0.143, 0.133, 0.267] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.947, 1.000, 0.292, 0.178, 0.227, 0.122, 0.154] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
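
Farthest-first selection greedily picks each next vector to maximise its minimum distance to the vectors already selected. A minimal sketch; the log does not show the program's seeding or distance metric, so a first-element seed and Euclidean distance are assumptions:

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal: each new pick maximises the minimum
    # Euclidean distance to all previously selected vectors.
    selected = [vectors[0]]  # seed choice is an assumption
    while len(selected) < k and len(selected) < len(vectors):
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
    return selected
```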

Perform oracle with 100.00% accuracy on 59 weight vectors
  The oracle will correctly classify 59 weight vectors and wrongly classify 0
  Classified 51 matches and 8 non-matches
    Purity of oracle classification:  0.864
    Entropy of oracle classification: 0.573
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 59 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(15)479_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 479), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)479_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1092
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1092 weight vectors
  Containing 221 true matches and 871 true non-matches
    (20.24% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1000  (96.53%)
          2 :    33  (3.19%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
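
The frequency distribution above can be computed with two nested `Counter` passes over the weight vectors; a small sketch:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # First count how often each distinct vector occurs, then tabulate
    # how many distinct vectors share each occurrence count.
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

dist = occurrence_distribution([[1.0, 0.5], [1.0, 0.5], [0.2, 0.3]])
# one vector occurs twice, one occurs once
```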

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 850

Removed 1 non-pure weight vector
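
The pureness of a unique weight vector is the fraction of its occurrences that are true matches; the distribution reported above could be computed along these lines (a sketch only — the exact removal rule for non-pure vectors is not fully determined by this log):

```python
from collections import Counter, defaultdict

def pureness_distribution(pairs):
    # pairs: iterable of (weight_vector, true_match_status) tuples.
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[tuple(vec)].append(is_match)
    # Map each unique vector's pureness to how many unique vectors share it.
    return Counter(round(sum(labels) / len(labels), 3)
                   for labels in groups.values())
```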

Final number of weight vectors to use: 1091
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 845 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (845, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 845 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 845 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(10)528_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (10, 1 - acm diverg, 528), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)528_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 332
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 332 weight vectors
  Containing 151 true matches and 181 true non-matches
    (45.48% true matches)
  Identified 314 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   304  (96.82%)
          2 :     7  (2.23%)
          3 :     2  (0.64%)
          8 :     1  (0.32%)

Identified 1 non-pure unique weight vector (from 314 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 135
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 178

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 324
  Number of unique weight vectors: 313

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (313, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 313 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 73

Perform initial selection using "far" method

Farthest first selection of 73 weight vectors from 313 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 31 matches and 42 non-matches
    Purity of oracle classification:  0.575
    Entropy of oracle classification: 0.984
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  42
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 240 weight vectors
  Based on 31 matches and 42 non-matches
  Classified 107 matches and 133 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 73
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (107, 0.5753424657534246, 0.9835585673909616, 0.4246575342465753)
    (133, 0.5753424657534246, 0.9835585673909616, 0.4246575342465753)

Current size of match and non-match training data sets: 31 / 42

Selected cluster with (queue ordering: random):
- Purity 0.58 and entropy 0.98
- Size 107 weight vectors
- Estimated match proportion 0.425

Sample size for this cluster: 50

Farthest first selection of 50 weight vectors from 107 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 50 weight vectors
  The oracle will correctly classify 50 weight vectors and wrongly classify 0
  Classified 44 matches and 6 non-matches
    Purity of oracle classification:  0.880
    Entropy of oracle classification: 0.529
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 50 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing file: diverg(15)332_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 332), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)332_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 852
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 852 weight vectors
  Containing 226 true matches and 626 true non-matches
    (26.53% true matches)
  Identified 795 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   758  (95.35%)
          2 :    34  (4.28%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 795 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 605

Removed 1 non-pure weight vector

Final number of weight vectors to use: 851
  Number of unique weight vectors: 795

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (795, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 795 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 795 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
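The "farthest first" selection is the classic greedy traversal: repeatedly add the vector whose distance to the nearest already-selected vector is largest. A minimal sketch (the seeding rule and Euclidean metric are assumptions, not confirmed by the log):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors by farthest-first traversal, seeding
    with the first vector in the list."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    # min_d[i] = distance from vectors[i] to its nearest selected vector
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=lambda j: min_d[j])  # farthest point
        selected.append(vectors[i])
        for j, v in enumerate(vectors):                       # update distances
            d = dist(v, vectors[i])
            if d < min_d[j]:
                min_d[j] = d
    return selected
```

This spreads the oracle's sample across the whole cluster, which is why the selected vectors above mix clear matches and clear non-matches.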

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
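The reported purity and entropy follow the standard two-class definitions (purity is the majority-class fraction, entropy the Shannon entropy in bits); the sketch below reproduces the figures above for 27 matches and 58 non-matches:

```python
import math

def purity_entropy(num_match, num_non_match):
    """Two-class purity (majority fraction) and Shannon entropy in bits."""
    total = num_match + num_non_match
    p = num_match / float(total)
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                      # 0 * log(0) is taken as 0
            entropy -= q * math.log(q, 2)
    return purity, entropy
```

For example, `purity_entropy(27, 58)` gives purity 0.682 and entropy 0.902, matching the cluster statistics carried into Loop 2.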

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 710 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 147 matches and 563 non-matches
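The split of the unlabelled remainder of the cluster into predicted-match and predicted-non-match children can be sketched with scikit-learn's `SVC` standing in for whatever SVM implementation the script actually uses (the library choice and linear kernel are assumptions):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train an SVM on the oracle-classified sample and split the
    remaining weight vectors into two child clusters."""
    clf = SVC(kernel='linear')           # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    pred = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, pred) if p]
    non_matches = [v for v, p in zip(remaining_vecs, pred) if not p]
    return matches, non_matches
```

Both children are then pushed back onto the cluster queue, which is why the queue length grows to 2 in Loop 2.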

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (147, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (563, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 563 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 563 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.400, 0.393, 0.318, 0.647, 0.455] (False)
    [1.000, 0.000, 0.846, 0.778, 0.727, 0.632, 0.875] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.750, 1.000, 0.222, 0.095, 0.167, 0.139, 0.278] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 10 matches and 62 non-matches
    Purity of oracle classification:  0.861
    Entropy of oracle classification: 0.581
    Number of true matches:      10
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
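Putting the steps visible in the log together (random cluster selection from the queue, oracle labelling of a sample, purity check, split of the remainder, budget cut-off), the outer loop might be sketched as follows; all function names, parameters, and stopping rules here are assumptions:

```python
import random

def recursive_selection(root_cluster, budget, sample_fn, oracle_fn, split_fn,
                        min_purity=0.9, max_cluster_size=50):
    """Sketch of the outer loop: pop a random cluster, let the oracle label
    a sample of it, add the labels to the training sets, and split the
    unlabelled rest if the cluster is not pure enough or too large."""
    queue = [root_cluster]
    matches, non_matches = [], []
    used = 0                                  # manual classifications performed
    loop = 0
    while queue:
        loop += 1
        print('Loop %d: Queue length: %d' % (loop, len(queue)))
        cluster = queue.pop(random.randrange(len(queue)))  # queue ordering: random
        sample = sample_fn(cluster)           # e.g. farthest-first selection
        labels = oracle_fn(sample)            # manual (oracle) classification
        used += len(sample)
        matches += [v for m, v in zip(labels, sample) if m]
        non_matches += [v for m, v in zip(labels, sample) if not m]
        rest = [v for v in cluster if v not in sample]  # assumes distinct vectors
        n_match = sum(labels)
        purity = max(n_match, len(labels) - n_match) / float(len(labels))
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            # Cluster not pure enough or too large: split it further
            queue += split_fn(matches, non_matches, rest)
        if used >= budget:
            print('Reached end of manual classification budget')
            break
    return matches, non_matches, used
```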

39.0
Analyzing file: diverg(15)121_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.987805
recall                 0.270903
f-measure              0.425197
da                           82
dm                            0
ndm                           0
tp                           81
fp                            1
tn                  4.76529e+07
fn                          218
Name: (15, 1 - acm diverg, 121), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)121_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 266
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 266 weight vectors
  Containing 169 true matches and 97 true non-matches
    (63.53% true matches)
  Identified 249 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   241  (96.79%)
          2 :     5  (2.01%)
          3 :     2  (0.80%)
          9 :     1  (0.40%)

Identified 1 non-pure unique weight vector (from 249 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 152
     0.889 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 96

Removed 9 non-pure weight vectors

Final number of weight vectors to use: 257
  Number of unique weight vectors: 248

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (248, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 248 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 69

Perform initial selection using "far" method

Farthest first selection of 69 weight vectors from 248 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 36 matches and 33 non-matches
    Purity of oracle classification:  0.522
    Entropy of oracle classification: 0.999
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  33
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 179 weight vectors
  Based on 36 matches and 33 non-matches
  Classified 119 matches and 60 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 69
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (119, 0.5217391304347826, 0.9986359641585718, 0.5217391304347826)
    (60, 0.5217391304347826, 0.9986359641585718, 0.5217391304347826)

Current size of match and non-match training data sets: 36 / 33

Selected cluster (queue ordering: random) with:
- Purity 0.52 and entropy 1.00
- Size 60 weight vectors
- Estimated match proportion 0.522

Sample size for this cluster: 37

Farthest first selection of 37 weight vectors from 60 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.456, 1.000, 0.087, 0.208, 0.125, 0.152, 0.061] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.800, 1.000, 0.242, 0.121, 0.200, 0.171, 0.000] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.619, 1.000, 0.103, 0.163, 0.129, 0.146, 0.213] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 37 weight vectors
  The oracle will correctly classify 37 weight vectors and wrongly classify 0
  Classified 3 matches and 34 non-matches
    Purity of oracle classification:  0.919
    Entropy of oracle classification: 0.406
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  34
    Number of false non-matches: 0

Deleted 37 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

82.0
Analyzing file: diverg(15)859_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990291
recall                 0.341137
f-measure              0.507463
da                          103
dm                            0
ndm                           0
tp                          102
fp                            1
tn                  4.76529e+07
fn                          197
Name: (15, 1 - acm diverg, 859), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)859_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 342
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 342 weight vectors
  Containing 155 true matches and 187 true non-matches
    (45.32% true matches)
  Identified 324 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   314  (96.91%)
          2 :     7  (2.16%)
          3 :     2  (0.62%)
          8 :     1  (0.31%)

Identified 1 non-pure unique weight vector (from 324 unique weight vectors)
Pureness (as proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 139
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 184

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 334
  Number of unique weight vectors: 323

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (323, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 323 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 74

Perform initial selection using "far" method

Farthest first selection of 74 weight vectors from 323 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 28 matches and 46 non-matches
    Purity of oracle classification:  0.622
    Entropy of oracle classification: 0.957
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  46
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 249 weight vectors
  Based on 28 matches and 46 non-matches
  Classified 100 matches and 149 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 74
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (100, 0.6216216216216216, 0.9568886656798214, 0.3783783783783784)
    (149, 0.6216216216216216, 0.9568886656798214, 0.3783783783783784)

Current size of match and non-match training data sets: 28 / 46

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 149 weight vectors
- Estimated match proportion 0.378

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.818, 0.636, 0.313, 0.750, 0.600] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
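
Farthest-first selection, as used for the samples listed above, greedily picks the vector whose distance to its nearest already-selected vector is largest, giving a diverse sample. A minimal sketch under the assumption of Euclidean distance and a fixed starting vector (the script's actual seeding is not shown in this log):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector, then
    repeatedly add the vector farthest from all selected vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        min_dist = [min(d, dist(v, vectors[i]))
                    for v, d in zip(vectors, min_dist)]
    return selected

sample = farthest_first([[0.0], [0.1], [0.5], [1.0]], 2)  # picks [0.0] then [1.0]
```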

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 11 matches and 45 non-matches
    Purity of oracle classification:  0.804
    Entropy of oracle classification: 0.715
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  45
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

103.0
Analysing file: diverg(20)155_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 155), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)155_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
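
A frequency distribution like the one above can be reproduced with a counter over the weight vectors (a sketch, not the script's code; vectors are hashed as tuples):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of unique weight vectors
    that occur exactly that often."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vec_counts.values())

# One vector occurs once, one twice, one three times
dist = occurrence_distribution([[0.1], [0.1], [0.5], [0.9], [0.9], [0.9]])
```

Dividing each count by the number of unique vectors gives the percentages shown (e.g. 982 / 1019 = 96.37%).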

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector
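
Assuming "pureness" is the match fraction among identical weight vectors, removing the minority-class copies of non-pure vectors can be sketched as follows (hypothetical helper, not the script's code):

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """Drop copies of a unique weight vector whose match label disagrees
    with that vector's majority label (i.e. resolve non-pure vectors).
    labelled_vectors is a list of (tuple_of_weights, is_match) pairs."""
    by_vec = defaultdict(list)
    for vec, is_match in labelled_vectors:
        by_vec[vec].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        # True if matches are at least half the copies (ties kept as match)
        majority = sum(labels) * 2 >= len(labels)
        kept.extend((vec, lab) for lab in labels if lab == majority)
    return kept
```

For the 0.950-pureness vector above (19 match copies, 1 non-match copy out of 20), this keeps the 19 majority copies and removes the single minority copy.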

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-match
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)969_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 969), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)969_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(20)962_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 962), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)962_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (proportion of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
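The "farthest first" selection above can be sketched as a greedy max-min traversal (a sketch only, not the script's exact implementation; the start point here is simply the first vector rather than a random one):

```python
import numpy as np

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum Euclidean distance to the already-selected set is largest."""
    vectors = np.asarray(vectors, dtype=float)
    selected = [0]                       # deterministic start: first vector
    # distance of every vector to its closest selected vector so far
    dists = np.linalg.norm(vectors - vectors[0], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(dists))      # farthest from the selected set
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```

This spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches and clear non-matches rather than clustering in one region.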

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
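The purity and entropy figures reported here are consistent with the majority-class fraction and the binary Shannon entropy of the match split; a minimal sketch:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class in the classified sample;
    entropy = binary Shannon entropy (base 2) of the match split."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# For the 23 matches / 65 non-matches above:
# purity ~0.739, entropy ~0.829
```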

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
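The SVM step trains on the oracle-labelled sample and splits the remaining cluster by predicted class. The script's exact SVM settings are not shown in this log; scikit-learn's `svm.SVC` with default parameters is assumed in this sketch:

```python
from sklearn import svm  # scikit-learn assumed; the log does not show the settings

def svm_split(train_vectors, train_labels, remaining_vectors):
    """Train an SVM on the oracle-labelled sample, then split the rest of
    the cluster into predicted matches and non-matches."""
    clf = svm.SVC()                          # default RBF kernel
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vectors, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.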

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)756_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.99
recall                 0.331104
f-measure              0.496241
da                          100
dm                            0
ndm                           0
tp                           99
fp                            1
tn                  4.76529e+07
fn                          200
Name: (15, 1 - acm diverg, 756), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)756_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 793
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 793 weight vectors
  Containing 166 true matches and 627 true non-matches
    (20.93% true matches)
  Identified 754 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   725  (96.15%)
          2 :    26  (3.45%)
          3 :     2  (0.27%)
         10 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 754 unique weight vectors)
Pureness (as a fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 147
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 606

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 792
  Number of unique weight vectors: 754

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (754, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 754 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 754 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 25 matches and 60 non-matches
    Purity of oracle classification:  0.706
    Entropy of oracle classification: 0.874
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 669 weight vectors
  Based on 25 matches and 60 non-matches
  Classified 89 matches and 580 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (89, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)
    (580, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)

Current size of match and non-match training data sets: 25 / 60

Selected cluster with (queue ordering: random):
- Purity 0.71 and entropy 0.87
- Size 89 weight vectors
- Estimated match proportion 0.294

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 89 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.952, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 41 matches and 1 non-match
    Purity of oracle classification:  0.976
    Entropy of oracle classification: 0.162
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing file: diverg(15)7_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 7), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)7_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 907
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 907 weight vectors
  Containing 200 true matches and 707 true non-matches
    (22.05% true matches)
  Identified 862 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   828  (96.06%)
          2 :    31  (3.60%)
          3 :     2  (0.23%)
         11 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 862 unique weight vectors)
Pureness (as a fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 686

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 906
  Number of unique weight vectors: 862

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (862, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 862 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 862 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 776 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 158 matches and 618 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (158, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (618, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 618 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 618 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
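
Farthest-first selection builds a diverse sample greedily: each step picks the vector whose minimum distance to the vectors already selected is largest. A sketch of the classic traversal (Euclidean distance and first-vector seeding are assumptions):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each maximising the minimum
    Euclidean distance to the vectors selected so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]              # seeding choice is an assumption
    remaining = list(vectors[1:])
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because it favours extreme points, the sample spreads over the whole cluster rather than concentrating near its centre.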

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 1 match and 73 non-matches
    Purity of oracle classification:  0.986
    Entropy of oracle classification: 0.103
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(20)106_NEW.csv
<class 'pandas.core.series.Series'>
Current line here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 106), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)106_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027
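
The duplicate analysis above can be reproduced by grouping identical weight vectors and computing each unique vector's pureness, i.e. the fraction of its occurrences generated by true matches. A minimal sketch (helper names are illustrative, not from the program):

```python
from collections import defaultdict

def pureness_per_unique_vector(weight_vectors, match_labels):
    """Group duplicate weight vectors and return, for each unique
    vector, the fraction of its occurrences that are true matches."""
    groups = defaultdict(list)
    for vec, is_match in zip(weight_vectors, match_labels):
        groups[tuple(vec)].append(is_match)
    return {vec: sum(labels) / len(labels) for vec, labels in groups.items()}
```

A vector with pureness strictly between 0 and 1 is non-pure; the log shows that its minority-class copies (here 1 of 20 occurrences) are removed before selection starts.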

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 179 matches and 760 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (760, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 179 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 179 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 43 matches and 15 non-matches
    Purity of oracle classification:  0.741
    Entropy of oracle classification: 0.825
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  15
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)428_NEW.csv
<class 'pandas.core.series.Series'>
Current line here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 428), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)428_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 817 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 817 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 817 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 11 matches and 60 non-matches
    Purity of oracle classification:  0.845
    Entropy of oracle classification: 0.622
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
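The purity and entropy figures reported after each oracle pass follow the usual two-class definitions: purity is the majority-class fraction of the classified sample, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch reproducing the numbers above (function names are illustrative, not from the program):

```python
import math

def cluster_purity(num_matches, num_non_matches):
    """Fraction of the sample belonging to the majority class."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def cluster_entropy(num_matches, num_non_matches):
    """Binary Shannon entropy (in bits) of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    if p in (0.0, 1.0):
        return 0.0  # a pure sample carries no uncertainty
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Reproduce the figures above for 11 matches / 60 non-matches:
print(round(cluster_purity(11, 60), 3))   # 0.845
print(round(cluster_entropy(11, 60), 3))  # 0.622
```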

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)514_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 514), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)514_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 934
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 934 weight vectors
  Containing 200 true matches and 734 true non-matches
    (21.41% true matches)
  Identified 889 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   855  (96.18%)
          2 :    31  (3.49%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 889 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 175
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 713

Removed 1 non-pure weight vector

Final number of weight vectors to use: 933
  Number of unique weight vectors: 889
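The load-and-analyse step above boils down to grouping identical weight vectors, tabulating how often each unique vector occurs, and computing each unique vector's pureness (its fraction of true matches) so that minority-class copies of non-pure vectors can be removed. A sketch under these assumptions (names are illustrative, not from the program):

```python
from collections import Counter, defaultdict

def analyse_weight_vectors(vectors, labels):
    """Build the occurrence distribution of unique weight vectors
    and the pureness (fraction of true matches) of each one."""
    counts = Counter(map(tuple, vectors))      # unique vector -> occurrences
    match_counts = defaultdict(int)            # unique vector -> match copies
    for vec, is_match in zip(vectors, labels):
        if is_match:
            match_counts[tuple(vec)] += 1
    # occurrence count -> how many unique vectors occur that often
    freq_dist = Counter(counts.values())
    pureness = {v: match_counts[v] / c for v, c in counts.items()}
    return freq_dist, pureness

vecs = [[1.0, 0.0], [1.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
labs = [True, False, False, True]
freq, pure = analyse_weight_vectors(vecs, labs)
# freq == {1: 2, 2: 1}: two vectors occur once, one occurs twice
# pure[(1.0, 0.0)] == 0.5: non-pure, so its minority copies would be removed
```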

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (889, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 889 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 889 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
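The "far" method shown above is a farthest-first traversal: starting from an initial vector, it repeatedly selects the vector whose distance to its nearest already-selected vector is largest, so the sample spreads across the cluster. A compact sketch assuming Euclidean distance and a fixed starting index (the program may use a different metric or seeding rule):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedily select k indices: each new pick maximises the
    distance to its closest already-selected vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    # min distance from every vector to the selected set so far
    min_d = [dist(v, vectors[start]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=lambda i: min_d[i])
        selected.append(nxt)
        for i, v in enumerate(vectors):
            min_d[i] = min(min_d[i], dist(v, vectors[nxt]))
    return selected

pts = [[0.0], [0.1], [0.5], [1.0]]
print(farthest_first(pts, 3))  # [0, 3, 2]
```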

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 803 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 159 matches and 644 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (644, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 159 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 159 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 44 matches and 11 non-matches
    Purity of oracle classification:  0.800
    Entropy of oracle classification: 0.722
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(20)445_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 445), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)445_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)463_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 463), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)463_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as the fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
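
The farthest-first traversal above greedily picks, at each step, the weight vector with the largest minimum Euclidean distance to the vectors already selected, so the sample spreads over the corners of the similarity space. A minimal sketch of the idea (the function name `farthest_first` and the fixed starting index are assumptions, not the script's actual implementation):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedily select k vector indices, each maximising its minimum
    Euclidean distance to the vectors selected so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [start]
    while len(selected) < k:
        # For every unselected vector, find its distance to the closest
        # selected vector, then pick the vector whose such distance is largest.
        best = max((i for i in range(len(vectors)) if i not in selected),
                   key=lambda i: min(dist(vectors[i], vectors[j])
                                     for j in selected))
        selected.append(best)
    return selected
```

Because the selection maximises spread, both near-certain matches (all similarities close to 1.0) and near-certain non-matches appear early in the lists above.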

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0
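
The purity and entropy figures reported after each oracle call follow directly from the match / non-match split of the sample: purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion. A sketch under those standard definitions (the function name is an assumption):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary Shannon entropy of a
    cluster sample with the given class counts."""
    n = num_matches + num_non_matches
    p = num_matches / n  # estimated match proportion
    purity = max(p, 1 - p)
    entropy = 0.0
    for q in (p, 1 - p):
        if q > 0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy, p

# The 23 / 64 split above gives purity 0.736 and entropy 0.833.
```

The same match proportion (23/87 ≈ 0.264) is what is carried forward as the "estimated match proportion" of the two sub-clusters pushed onto the queue.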

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 820 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 820 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)682_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 682), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)682_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 844
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 844 weight vectors
  Containing 226 true matches and 618 true non-matches
    (26.78% true matches)
  Identified 787 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (95.30%)
          2 :    34  (4.32%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 787 unique weight vectors)
Pureness (fraction of true matches) per unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 597

Removed 1 non-pure weight vector

Final number of weight vectors to use: 843
  Number of unique weight vectors: 787

Time to load and analyse the weight vector file: 0.01 sec
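
The pre-processing step above groups identical weight vectors, computes each unique vector's "pureness" (the fraction of its occurrences that are true matches), and removes the minority-class copies of any vector that is not fully pure. A minimal sketch (function name assumed):

```python
from collections import defaultdict

def pureness_analysis(vectors, labels):
    """Group duplicate weight vectors, compute the match fraction of each
    unique vector, and count the minority-class copies to be removed."""
    groups = defaultdict(list)
    for vec, is_match in zip(vectors, labels):
        groups[tuple(vec)].append(is_match)
    # Pureness = fraction of true-match occurrences per unique vector.
    pureness = {v: sum(ls) / len(ls) for v, ls in groups.items()}
    # Minority-class copies of non-pure vectors are removed.
    num_removed = sum(min(sum(ls), len(ls) - sum(ls))
                      for ls in groups.values())
    return pureness, num_removed
```

Consistent with the log above: one vector occurs 20 times, and the single 0.950 pureness entry corresponds to a 19 match / 1 non-match split, so exactly one minority-class copy is removed.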

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (787, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 787 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 787 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 702 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 160 matches and 542 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (160, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (542, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 160 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 160 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 48 matches and 8 non-matches
    Purity of oracle classification:  0.857
    Entropy of oracle classification: 0.592
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  8
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)389_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 389), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)389_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (fraction of true matches) per unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 179 matches and 760 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (760, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 760 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 760 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
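
The farthest-first selection used above can be sketched as a greedy traversal: repeatedly pick the vector whose minimum Euclidean distance to the already-selected set is largest. The seeding choice below (the first vector) is an assumption; the program may seed differently:

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors:
    each step adds the vector maximising the minimum Euclidean distance
    to the vectors selected so far."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed choice is an assumption
    while len(selected) < k:
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

This favours weight vectors spread across the whole cluster, which is why the samples above mix high- and low-similarity vectors rather than near-duplicates.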

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0
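
The oracle step (here run at 100% accuracy, controlled by the `oracle_acc` argument) can be simulated by returning each vector's true match status and flipping it with probability 1 - accuracy. A minimal sketch with illustrative names:

```python
import random

def simulate_oracle(true_statuses, accuracy=1.0, rng=None):
    """Simulated manual oracle: returns each weight vector's true match
    status, flipped with probability (1 - accuracy)."""
    rng = rng or random.Random()
    decisions = []
    for status in true_statuses:
        correct = rng.random() < accuracy
        decisions.append(status if correct else not status)
    return decisions
```

With accuracy 1.0 every classification is correct, so the false match and false non-match counts in the log are always zero.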

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)327_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 327), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)327_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
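
The occurrence frequency distribution above can be computed with two nested counts: one over the distinct weight vectors, then one over how often each count occurs. A short sketch:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then build the
    'occurrence : number of vectors occurring that often' distribution."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())
```

For example, `occurrence_distribution([[1, 2], [1, 2], [3, 4]])` reports one vector occurring twice and one vector occurring once.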

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88
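
The per-cluster sample size is consistent with Cochran's sample-size formula plus a finite population correction; the z-score and error margin below are assumptions (the program takes `sample_error` as a command-line argument, and its exact value is not shown in this log):

```python
def cluster_sample_size(cluster_size, p_match=0.5, z=1.96, error=0.1):
    """Cochran's sample size n0 = z^2 * p * (1 - p) / e^2, then a finite
    population correction for a cluster of the given size.  The z and
    error values are assumed, not read from the log."""
    n0 = z * z * p_match * (1.0 - p_match) / (error * error)
    return int(round(n0 / (1.0 + (n0 - 1.0) / cluster_size)))
```

Under these assumed parameters the initial p = 0.5 clusters reproduce the logged sizes (1036 vectors -> 88, 964 -> 87, 676 -> 84); clusters with a refined match-proportion estimate may round slightly differently.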

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 24 matches and 64 non-matches
    Purity of oracle classification:  0.727
    Entropy of oracle classification: 0.845
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 24 matches and 64 non-matches
  Classified 91 matches and 857 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (91, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)
    (857, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)

Current size of match and non-match training data sets: 24 / 64

Selected cluster with (queue ordering: random):
- Purity 0.73 and entropy 0.85
- Size 91 weight vectors
- Estimated match proportion 0.273

Sample size for this cluster: 42

Farthest first selection of 42 weight vectors from 91 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.933, 1.000, 1.000, 0.900, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 0.950, 0.923, 0.941] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 42 weight vectors
  The oracle will correctly classify 42 weight vectors and wrongly classify 0
  Classified 41 matches and 1 non-match
    Purity of oracle classification:  0.976
    Entropy of oracle classification: 0.162
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 42 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(15)718_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 718), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)718_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1018
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1018 weight vectors
  Containing 222 true matches and 796 true non-matches
    (21.81% true matches)
  Identified 964 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   927  (96.16%)
          2 :    34  (3.53%)
          3 :     2  (0.21%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 964 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 775

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1017
  Number of unique weight vectors: 964

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (964, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 964 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 964 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 877 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 148 matches and 729 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (729, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 148 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 52

Farthest first selection of 52 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)

Perform oracle with 100.00% accuracy on 52 weight vectors
  The oracle will correctly classify 52 weight vectors and wrongly classify 0
  Classified 49 matches and 3 non-matches
    Purity of oracle classification:  0.942
    Entropy of oracle classification: 0.318
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 52 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(10)289_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (10, 1 - acm diverg, 289), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)289_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 726
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 726 weight vectors
  Containing 202 true matches and 524 true non-matches
    (27.82% true matches)
  Identified 676 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   642  (94.97%)
          2 :    31  (4.59%)
          3 :     2  (0.30%)
         16 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 676 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 503

Removed 1 non-pure weight vectors
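
A non-pure unique weight vector is one that occurs with both match statuses; its pureness is the fraction of its copies that are matches, and the minority-status copies are removed. A sketch of that clean-up step, assuming weight vectors are given as (vector, match status) pairs (the helper name and tie-breaking are assumptions):

```python
from collections import defaultdict

def remove_non_pure_minority(weight_vectors):
    """weight_vectors: list of (vector, is_match) pairs.  Identical vectors
    occurring with both match statuses are non-pure; drop the copies that
    carry the minority status (a tie keeps the match copies)."""
    counts = defaultdict(lambda: [0, 0])        # vector -> [matches, non-matches]
    for vec, is_match in weight_vectors:
        counts[tuple(vec)][0 if is_match else 1] += 1
    kept = []
    for vec, is_match in weight_vectors:
        m, n = counts[tuple(vec)]
        if is_match == (m >= n):                # keep only majority-status copies
            kept.append((vec, is_match))
    return kept

# 15 match copies and 1 non-match copy of one vector: pureness 15/16 = 0.938
data = [((0.9, 1.0), True)] * 15 + [((0.9, 1.0), False)]
print(len(remove_non_pure_minority(data)))  # prints 15
```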

Final number of weight vectors to use: 725
  Number of unique weight vectors: 676

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (676, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 676 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 676 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
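
Farthest-first selection greedily picks, at each step, the weight vector whose distance to its nearest already-selected vector is largest. A minimal sketch, assuming squared Euclidean distance and a random starting vector (the program's actual distance measure and tie-breaking may differ):

```python
import random

def farthest_first(vectors, k, seed=None):
    """Greedy farthest-first traversal: start from a random vector, then
    repeatedly add the vector whose distance to its nearest already
    selected vector is largest."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    rng = random.Random(seed)
    remaining = list(vectors)
    selected = [remaining.pop(rng.randrange(len(remaining)))]
    # min_dist[i]: squared distance from remaining[i] to its nearest selected vector
    min_dist = [sq_dist(v, selected[0]) for v in remaining]
    while remaining and len(selected) < k:
        i = max(range(len(remaining)), key=min_dist.__getitem__)
        chosen = remaining.pop(i)
        min_dist.pop(i)
        selected.append(chosen)
        min_dist = [min(d, sq_dist(v, chosen)) for d, v in zip(min_dist, remaining)]
    return selected

corners = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)], 3)
```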

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 592 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 284 matches and 308 non-matches
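
After the oracle-labelled sample is removed, the remaining weight vectors of the cluster are split by a classifier trained on that sample. A minimal sketch of the SVM split step using scikit-learn's `SVC` (an assumption; the original program may use a different SVM implementation and kernel):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(sample_vecs, sample_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled sample, then split the remaining
    cluster into predicted matches (label 1) and non-matches (label 0)."""
    clf = SVC(kernel="linear")
    clf.fit(sample_vecs, sample_labels)
    pred = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if p == 0]
    return matches, non_matches

# Toy data: two well-separated blobs standing in for match/non-match vectors
rng = np.random.default_rng(0)
sample = np.vstack([rng.normal(0.8, 0.05, (10, 3)), rng.normal(0.2, 0.05, (10, 3))])
labels = [1] * 10 + [0] * 10
cluster = np.vstack([rng.normal(0.8, 0.05, (5, 3)), rng.normal(0.2, 0.05, (5, 3))])
matches, non_matches = svm_split(sample, labels, cluster)
```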

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (284, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (308, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 308 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 308 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.767, 0.545, 0.818, 0.714, 0.773] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.423, 0.478, 0.357, 0.615, 0.727] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [0.800, 0.000, 0.625, 0.571, 0.467, 0.474, 0.667] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.778, 0.500, 0.789, 0.750, 0.385] (False)
    [1.000, 0.000, 0.333, 0.600, 0.800, 0.778, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.833, 0.833, 0.550, 0.500, 0.688] (False)
    [1.000, 0.000, 0.700, 0.818, 0.563, 0.455, 0.278] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.875, 0.467, 0.471, 0.833, 0.571] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.857, 0.000, 0.500, 0.389, 0.235, 0.045, 0.526] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [1.000, 0.000, 0.556, 0.364, 0.583, 0.500, 0.636] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.600, 0.857, 0.579, 0.286, 0.545] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.000, 0.700, 0.818, 0.444, 0.619] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [1.000, 0.000, 0.767, 0.667, 0.545, 0.786, 0.773] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 0 matches and 69 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing the file: diverg(15)889_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 889), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)889_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 728
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 728 weight vectors
  Containing 197 true matches and 531 true non-matches
    (27.06% true matches)
  Identified 704 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   687  (97.59%)
          2 :    14  (1.99%)
          3 :     2  (0.28%)
          7 :     1  (0.14%)

Identified 0 non-pure unique weight vectors (from 704 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 175
     0.000 : 529

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 728
  Number of unique weight vectors: 704

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (704, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 704 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 704 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.346, 0.769, 0.636, 0.419, 0.364] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.778, 0.900, 0.400, 0.350, 0.563] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 29 matches and 55 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.930
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 620 weight vectors
  Based on 29 matches and 55 non-matches
  Classified 134 matches and 486 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (134, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)
    (486, 0.6547619047619048, 0.9297432191769048, 0.34523809523809523)

Current size of match and non-match training data sets: 29 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.93
- Size 486 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 486 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.542, 0.526, 0.850, 0.318, 0.800] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 8 matches and 66 non-matches
    Purity of oracle classification:  0.892
    Entropy of oracle classification: 0.494
    Number of true matches:      8
    Number of false matches:     0
    Number of true non-matches:  66
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

68.0
Analysing the file: diverg(10)86_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 86), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)86_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 432
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 432 weight vectors
  Containing 202 true matches and 230 true non-matches
    (46.76% true matches)
  Identified 406 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   392  (96.55%)
          2 :    11  (2.71%)
          3 :     2  (0.49%)
         12 :     1  (0.25%)

Identified 1 non-pure unique weight vectors (from 406 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 176
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 229

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 431
  Number of unique weight vectors: 406

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (406, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 406 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 78

Perform initial selection using "far" method

Farthest first selection of 78 weight vectors from 406 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 78 weight vectors
  The oracle will correctly classify 78 weight vectors and wrongly classify 0
  Classified 38 matches and 40 non-matches
    Purity of oracle classification:  0.513
    Entropy of oracle classification: 1.000
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  40
    Number of false non-matches: 0
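
The purity and entropy figures reported for the oracle classification follow directly from the match / non-match counts. A minimal sketch of how they could be computed (the function name is mine, not taken from the script):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = Shannon entropy
    (in bits) of the match / non-match split. Both lie in [0, 1] for a
    two-class cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total          # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log2(q)
    return purity, entropy

# The counts from the oracle step above: 38 matches, 40 non-matches
purity, entropy = purity_and_entropy(38, 40)
print(round(purity, 3), round(entropy, 3))  # 0.513 1.0
```

These are exactly the values that reappear in the queue listing below (0.5128..., 0.99952...), so the queue entries simply carry the sampled cluster's statistics forward.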

Deleted 78 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 328 weight vectors
  Based on 38 matches and 40 non-matches
  Classified 144 matches and 184 non-matches
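
After the oracle labels the sample, the remaining unlabelled weight vectors in the cluster are split by a classifier trained on those labels (per the `split_classifier` option, an SVM here). As a dependency-free illustration of the split step only, this sketch substitutes a simple nearest-centroid rule for the actual SVM; the function names are mine:

```python
# Split the unlabelled remainder of a cluster using a classifier trained
# on the oracle-labelled sample. The real script uses an SVM; this sketch
# uses nearest-centroid as a stand-in so it needs no external libraries.

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def split_cluster(match_sample, non_match_sample, remainder):
    cm = centroid(match_sample)
    cn = centroid(non_match_sample)
    def dist2(a, b):  # squared Euclidean distance
        return sum((x - y) ** 2 for x, y in zip(a, b))
    matches, non_matches = [], []
    for v in remainder:
        (matches if dist2(v, cm) <= dist2(v, cn) else non_matches).append(v)
    return matches, non_matches

# Toy example: 2-D vectors near (1,1) are "matches", near (0,0) are not
m, n = split_cluster([[1, 1], [0.9, 1]], [[0, 0], [0.1, 0]],
                     [[0.8, 0.9], [0.2, 0.1]])
print(len(m), len(n))  # 1 1
```

The two resulting sub-clusters (here 144 matches and 184 non-matches) are what gets pushed back onto the queue for the next loop.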

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 78
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.5128205128205128, 0.9995256892936493, 0.48717948717948717)
    (184, 0.5128205128205128, 0.9995256892936493, 0.48717948717948717)

Current size of match and non-match training data sets: 38 / 40

Selected cluster (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 144 weight vectors
- Estimated match proportion 0.487

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 144 vectors
  The selected farthest weight vectors are:
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 1.000, 0.933, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
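
Farthest-first selection is a greedy max-min traversal: start from one vector, then repeatedly add the vector whose nearest already-selected vector is farthest away. The exact seeding and distance metric used by the script may differ; a generic sketch (function name mine):

```python
import math

def farthest_first(vectors, k, seed_index=0):
    """Greedy farthest-first traversal: repeatedly pick the vector that
    maximises the minimum Euclidean distance to the vectors selected so
    far, until k vectors are chosen."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[seed_index]]
    remaining = [v for i, v in enumerate(vectors) if i != seed_index]
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy example: from four 1-D points, pick the three most spread out
print(farthest_first([[0.0], [0.1], [0.5], [1.0]], 3))  # [[0.0], [1.0], [0.5]]
```

The point of this strategy is visible in the selections above: the sampled vectors cover the extremes of the cluster rather than its dense centre.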

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 49 matches and 9 non-matches
    Purity of oracle classification:  0.845
    Entropy of oracle classification: 0.623
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  9
    Number of false non-matches: 0
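
The `oracle_acc` parameter from the usage header suggests the oracle can be made imperfect; with 100.00% accuracy, as in these runs, no labels are flipped. A hypothetical reconstruction (names and flip rule are my assumptions, not taken from the script) where each queried label is kept with the given probability and flipped otherwise:

```python
import random

def noisy_oracle(true_labels, accuracy, rng=None):
    """Hypothetical oracle: keep each true match label with probability
    `accuracy`, flip it otherwise. At accuracy 1.0 the output equals the
    ground truth, matching the 'wrongly classify 0' lines in the log."""
    rng = rng or random.Random(42)  # fixed seed for reproducibility
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]

labels = [True] * 49 + [False] * 9
assert noisy_oracle(labels, 1.0) == labels  # 100.00% accuracy: no flips
```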

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analyzing file: diverg(15)24_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 24), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)24_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1071
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1071 weight vectors
  Containing 226 true matches and 845 true non-matches
    (21.10% true matches)
  Identified 1014 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   977  (96.35%)
          2 :    34  (3.35%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
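
This distribution is a histogram of a histogram: first count how often each distinct weight vector occurs, then count how many distinct vectors share each occurrence count. A sketch with `collections.Counter` (function name mine):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map occurrence count -> number of distinct weight vectors that
    occur exactly that often."""
    per_vector = Counter(tuple(v) for v in vectors)   # vector -> count
    return Counter(per_vector.values())               # count -> frequency

vecs = [(1.0, 0.5)] * 3 + [(0.0, 0.0)] * 2 + [(0.2, 0.9)]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```

Here, for instance, one distinct vector occurs 20 times; duplicated vectors like this are what make the unique count (1014) smaller than the total (1071).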

Identified 1 non-pure unique weight vectors (from 1014 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 824
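
A unique weight vector is "non-pure" when it was generated by both true matching and true non-matching record pairs; its minority-class copies are removed so every remaining vector is pure. For the 0.950-pure vector above (19 match copies, 1 non-match copy), exactly that one non-match copy goes. A sketch of this rule as I read it from the log (function name mine):

```python
def remove_minority_copies(labelled_vectors):
    """labelled_vectors: list of (weight_vector_tuple, is_match) pairs.
    For each distinct vector, drop the copies carrying the minority
    label so every surviving vector is pure (all-match or all-non-match)."""
    groups = {}
    for vec, is_match in labelled_vectors:
        groups.setdefault(vec, []).append(is_match)
    kept = []
    for vec, labels in groups.items():
        majority = labels.count(True) >= labels.count(False)
        kept.extend((vec, majority) for lab in labels if lab == majority)
    return kept

# One vector seen 20 times: 19 matches, 1 non-match (pureness 0.950)
data = [((0.9, 0.8), True)] * 19 + [((0.9, 0.8), False)]
print(len(remove_minority_copies(data)))  # 19
```

This matches the bookkeeping that follows: 1071 vectors minus the 1 removed minority copy leaves 1070 to use.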

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1070
  Number of unique weight vectors: 1014

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1014, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1014 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1014 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 927 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 325 matches and 602 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (325, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (602, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 602 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 76

Farthest first selection of 76 weight vectors from 602 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.320, 0.545, 0.773, 0.643, 0.591] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.731, 0.652, 0.583, 0.241, 0.229] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 0 matches and 76 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  76
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(20)540_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (20, 1 - acm diverg, 540), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)540_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 894
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 894 weight vectors
  Containing 190 true matches and 704 true non-matches
    (21.25% true matches)
  Identified 854 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   820  (96.02%)
          2 :    31  (3.63%)
          3 :     2  (0.23%)
          6 :     1  (0.12%)

Identified 0 non-pure unique weight vectors (from 854 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 170
     0.000 : 684

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 894
  Number of unique weight vectors: 854

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (854, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 854 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 854 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 27 matches and 59 non-matches
    Purity of oracle classification:  0.686
    Entropy of oracle classification: 0.898
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 768 weight vectors
  Based on 27 matches and 59 non-matches
  Classified 117 matches and 651 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (117, 0.686046511627907, 0.8976844934141643, 0.313953488372093)
    (651, 0.686046511627907, 0.8976844934141643, 0.313953488372093)

Current size of match and non-match training data sets: 27 / 59

Selected cluster (queue ordering: random):
- Purity 0.69 and entropy 0.90
- Size 651 weight vectors
- Estimated match proportion 0.314

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 651 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
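Farthest-first selection, as used above, can be sketched as a greedy traversal: seed with one vector, then repeatedly add the vector whose nearest already-selected neighbour is farthest away. This is a minimal sketch with Euclidean distance; the script's actual distance measure and seeding strategy are assumptions here:

```python
import random

def farthest_first(vectors, k, seed=0):
    """Greedy farthest-first traversal: select k vectors from the
    given list, each maximising the distance to its nearest
    already-selected neighbour."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    rng = random.Random(seed)
    selected = [rng.choice(vectors)]
    while len(selected) < k:
        # the vector farthest from all selected ones so far
        selected.append(max(vectors,
                            key=lambda v: min(dist(v, s) for s in selected)))
    return selected
```

Because already-selected vectors have a nearest-neighbour distance of zero, each step picks a new, maximally spread-out vector, which is why the samples above mix matches and non-matches from across the weight space.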

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 11 matches and 62 non-matches
    Purity of oracle classification:  0.849
    Entropy of oracle classification: 0.612
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0
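A simulated oracle of the kind reported above (perfect here, but the `oracle_acc` usage parameter allows noise) can be sketched by flipping each true label with probability 1 - accuracy; `oracle_classify` is an illustrative name, not from the script:

```python
import random

def oracle_classify(true_labels, accuracy=1.0, seed=0):
    """Simulated manual oracle: return each true match label unchanged
    with probability `accuracy`, flipped otherwise. accuracy=1.0
    reproduces the perfect oracle in the log above."""
    rng = random.Random(seed)
    return [lbl if rng.random() < accuracy else not lbl
            for lbl in true_labels]
```

At accuracy 1.0 the counts of false matches and false non-matches are always zero, as in this run.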

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

69.0
Analysing the file: diverg(10)116_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 116), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)116_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 382
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 382 weight vectors
  Containing 217 true matches and 165 true non-matches
    (56.81% true matches)
  Identified 349 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   333  (95.42%)
          2 :    13  (3.72%)
          3 :     2  (0.57%)
         17 :     1  (0.29%)
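A frequency distribution like the one above can be computed with two nested `Counter`s, shown here on toy 2-d vectors rather than the 7-dimensional ones in the log:

```python
from collections import Counter

# toy weight vectors; one occurs twice, two occur once
vectors = [(0.9, 1.0), (0.9, 1.0), (0.5, 0.2), (0.1, 0.0)]

per_vector = Counter(vectors)                 # unique vector -> occurrence count
distribution = Counter(per_vector.values())   # occurrence count -> number of vectors
```

`distribution` then maps each occurrence count to how many distinct vectors occur that often, exactly the two columns printed above.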

Identified 1 non-pure unique weight vector (from 349 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 164

Removed 1 non-pure weight vector
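The removal of minority-class copies of non-pure weight vectors can be sketched as follows. This is an assumption about the exact rule; the script's tie-breaking for 50/50 vectors may differ:

```python
from collections import defaultdict

def remove_minority_copies(pairs):
    """pairs: list of (vector, is_match) tuples. For each unique vector,
    drop the copies carrying the minority label, so every surviving
    unique vector is pure (all-match or all-non-match)."""
    groups = defaultdict(list)
    for vec, is_match in pairs:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        majority = sum(labels) * 2 >= len(labels)  # ties kept as matches
        kept.extend((vec, lbl) for lbl in labels if lbl == majority)
    return kept
```

For the 0.941-pure vector above (16 match copies, 1 non-match copy) this keeps the 16 matches and drops the single non-match, which is the one removed vector reported.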

Final number of weight vectors to use: 381
  Number of unique weight vectors: 349

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (349, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 349 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 75

Perform initial selection using "far" method

Farthest first selection of 75 weight vectors from 349 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 47 matches and 28 non-matches
    Purity of oracle classification:  0.627
    Entropy of oracle classification: 0.953
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  28
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 274 weight vectors
  Based on 47 matches and 28 non-matches
  Classified 274 matches and 0 non-matches
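The split step trains a classifier on the oracle-labelled vectors and classifies the remaining ones; a sketch with scikit-learn's `SVC`, assuming that (or something similar) backs the SVM step, on toy 2-d vectors:

```python
from sklearn.svm import SVC  # assumes scikit-learn is installed

# toy labelled weight vectors standing in for the 7-d ones above
train_X = [[0.9, 0.9], [0.8, 1.0], [0.1, 0.2], [0.2, 0.1]]
train_y = [1, 1, 0, 0]  # 1 = match, 0 = non-match

clf = SVC(kernel="linear")
clf.fit(train_X, train_y)

# classify the vectors the oracle has not labelled
pred = clf.predict([[0.85, 0.95], [0.15, 0.15]])
```

Note that in the run above the learned boundary put all 274 remaining vectors on the match side, a degenerate split the recursion must then handle.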

42.0
Analysing the file: diverg(20)542_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 542), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)542_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1076
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1076 weight vectors
  Containing 227 true matches and 849 true non-matches
    (21.10% true matches)
  Identified 1019 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   982  (96.37%)
          2 :    34  (3.34%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1019 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 828

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1075
  Number of unique weight vectors: 1019

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1019, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1019 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1019 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 932 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 112 matches and 820 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (112, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (820, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 112 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 45

Farthest first selection of 45 weight vectors from 112 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 45 weight vectors
  The oracle will correctly classify 45 weight vectors and wrongly classify 0
  Classified 44 matches and 1 non-matches
    Purity of oracle classification:  0.978
    Entropy of oracle classification: 0.154
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 45 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)210_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (20, 1 - acm diverg, 210), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)210_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 920
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 920 weight vectors
  Containing 215 true matches and 705 true non-matches
    (23.37% true matches)
  Identified 868 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   832  (95.85%)
          2 :    33  (3.80%)
          3 :     2  (0.23%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 868 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 183
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 684

Removed 1 non-pure weight vector

Final number of weight vectors to use: 919
  Number of unique weight vectors: 868

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (868, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 868 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 868 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
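
The purity and entropy figures the oracle reports (0.674 and 0.910 for the 28/58 split above) are the majority-class fraction and the binary Shannon entropy of the labelled sample. A minimal sketch, not the original script's code (`purity_entropy` is a hypothetical helper):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary Shannon entropy
    (base 2) of an oracle-classified sample, plus the estimated
    match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total          # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:                  # 0*log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy, p

# The 28 matches / 58 non-matches reported above give
# purity 0.674, entropy 0.910, match proportion 0.326.
purity, entropy, match_prop = purity_entropy(28, 58)
```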

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 782 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 158 matches and 624 non-matches
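
After the oracle labels a sample, the rest of the cluster is split by a classifier trained on those labels, as in the SVM step above. A minimal scikit-learn sketch assuming a linear-kernel SVM (the log does not show the original script's kernel or parameters; `svm_split` is a hypothetical helper):

```python
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on the oracle-labelled weight vectors, then split
    the remaining unlabelled vectors into a predicted-match and a
    predicted-non-match sub-cluster."""
    clf = SVC(kernel="linear")
    clf.fit(labelled_vecs, labels)
    pred = clf.predict(unlabelled_vecs)
    matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vecs, pred) if p == 0]
    return matches, non_matches
```

Each sub-cluster then goes back on the queue, inheriting the parent sample's purity, entropy, and estimated match proportion until it is sampled itself.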

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (158, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (624, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 158 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 158 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
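
Farthest-first selection, as used above, greedily picks the weight vector whose minimum Euclidean distance to the already-selected vectors is largest, so the sample spreads across the cluster. A minimal sketch (seeding from the vector farthest from the centroid is an assumption; the original script may seed differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors."""
    dim = len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Seed with the vector farthest from the centroid (assumption).
    selected = [max(vectors, key=lambda v: dist(v, centroid))]
    while len(selected) < k:
        remaining = [v for v in vectors if v not in selected]
        # Pick the vector maximising its minimum distance to the sample.
        selected.append(max(remaining,
                            key=lambda v: min(dist(v, s) for s in selected)))
    return selected
```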

Perform oracle with 100.00% accuracy on 55 weight vectors

  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(10)998_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985507
recall                 0.227425
f-measure              0.369565
da                           69
dm                            0
ndm                           0
tp                           68
fp                            1
tn                  4.76529e+07
fn                          231
Name: (10, 1 - acm diverg, 998), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)998_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 192
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 192 weight vectors
  Containing 167 true matches and 25 true non-matches
    (86.98% true matches)
  Identified 175 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   164  (93.71%)
          2 :     8  (4.57%)
          3 :     2  (1.14%)
          6 :     1  (0.57%)

Identified 0 non-pure unique weight vectors (from 175 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 150
     0.000 : 25

Removed 0 non-pure weight vectors
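
The pureness figures above come from grouping identical weight vectors and measuring, for each unique vector, the fraction of its occurrences generated by true matches; a pureness strictly between 0 and 1 marks a non-pure vector whose minority-class copies get removed. A minimal sketch (`pureness_per_unique` is a hypothetical helper, not the original script's function):

```python
from collections import defaultdict

def pureness_per_unique(weight_vectors, match_flags):
    """Map each unique weight vector (as a tuple) to the fraction of
    its occurrences that were generated by true matching pairs."""
    counts = defaultdict(lambda: [0, 0])   # key -> [num_matches, total]
    for vec, is_match in zip(weight_vectors, match_flags):
        key = tuple(vec)
        counts[key][0] += int(is_match)
        counts[key][1] += 1
    return {key: m / t for key, (m, t) in counts.items()}

# A duplicated vector with mixed labels gets pureness 0.5 (non-pure).
pureness = pureness_per_unique([[1, 1], [1, 1], [0, 0]],
                               [True, False, False])
```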

Final number of weight vectors to use: 192
  Number of unique weight vectors: 175

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (175, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 175 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 62

Perform initial selection using "far" method

Farthest first selection of 62 weight vectors from 175 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 62 weight vectors
  The oracle will correctly classify 62 weight vectors and wrongly classify 0
  Classified 43 matches and 19 non-matches
    Purity of oracle classification:  0.694
    Entropy of oracle classification: 0.889
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  19
    Number of false non-matches: 0

Deleted 62 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 113 weight vectors
  Based on 43 matches and 19 non-matches
  Classified 113 matches and 0 non-matches

69.0
Analysing file: diverg(20)45_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 45), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)45_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 109 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 44

Farthest first selection of 44 weight vectors from 109 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 44 weight vectors
  The oracle will correctly classify 44 weight vectors and wrongly classify 0
  Classified 43 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.156
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 44 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)634_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 634), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)634_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 389
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 389 weight vectors
  Containing 199 true matches and 190 true non-matches
    (51.16% true matches)
  Identified 362 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   346  (95.58%)
          2 :    13  (3.59%)
          3 :     2  (0.55%)
         11 :     1  (0.28%)

Identified 1 non-pure unique weight vector (from 362 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 174
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 187

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 388
  Number of unique weight vectors: 362

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (362, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 362 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 362 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 29 matches and 47 non-matches
    Purity of oracle classification:  0.618
    Entropy of oracle classification: 0.959
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 286 weight vectors
  Based on 29 matches and 47 non-matches
  Classified 137 matches and 149 non-matches
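The split step above trains a classifier on the oracle-labelled sample and partitions the remaining weight vectors of the cluster by its predictions. A sketch of that idea using scikit-learn's `SVC` (an assumption; the original program may use a different SVM library, kernel, or parameters):

```python
from sklearn.svm import SVC

def svm_split(match_sample, non_match_sample, cluster_vectors):
    """Fit an SVM on the oracle-classified sample, then split the rest
    of the cluster into predicted matches and predicted non-matches."""
    train_x = match_sample + non_match_sample
    train_y = [1] * len(match_sample) + [0] * len(non_match_sample)
    clf = SVC().fit(train_x, train_y)
    preds = clf.predict(cluster_vectors)
    matches = [v for v, p in zip(cluster_vectors, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vectors, preds) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.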

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 76
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (137, 0.618421052631579, 0.959149554396894, 0.3815789473684211)
    (149, 0.618421052631579, 0.959149554396894, 0.3815789473684211)

Current size of match and non-match training data sets: 29 / 47

Selected cluster (queue ordering: random) with:
- Purity 0.62 and entropy 0.96
- Size 149 weight vectors
- Estimated match proportion 0.382

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.833, 0.826, 0.733, 0.455, 0.588] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.300, 0.467, 0.750, 0.545, 0.684] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.636, 0.909, 0.313, 0.625, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.667, 0.500, 0.600, 0.500, 0.615] (False)
    [1.000, 1.000, 0.200, 0.200, 0.200, 0.200, 0.214] (False)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
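Farthest-first selection, as listed above, can be implemented as a greedy traversal: repeatedly pick the vector whose minimum Euclidean distance to the vectors already selected is largest, so the sample spreads across the cluster. A sketch (the seed choice and tie-breaking here are assumptions, not necessarily what the program does):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal over a list of weight vectors."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed with the first vector (assumption)
    remaining = vectors[1:]
    while len(selected) < k and remaining:
        # pick the remaining vector farthest from all selected ones
        best = max(remaining, key=lambda v: min(dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected
```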

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 9 matches and 48 non-matches
    Purity of oracle classification:  0.842
    Entropy of oracle classification: 0.629
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(10)421_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (10, 1 - acm diverg, 421), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)421_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 441
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 441 weight vectors
  Containing 220 true matches and 221 true non-matches
    (49.89% true matches)
  Identified 405 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   389  (96.05%)
          2 :    13  (3.21%)
          3 :     2  (0.49%)
         20 :     1  (0.25%)
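The frequency distribution above is a two-level count: first how often each distinct weight vector occurs, then how many distinct vectors share each occurrence count. A minimal sketch with `collections.Counter` (the function name is mine):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map each occurrence count to the number of distinct weight
    vectors occurring that often; vectors are made hashable as tuples."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return dict(sorted(Counter(per_vector.values()).items()))

# e.g. two copies of one vector and a single copy of another
print(occurrence_distribution([[0.5, 1.0], [0.5, 1.0], [0.2, 0.3]]))
# → {1: 1, 2: 1}
```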

Identified 1 non-pure unique weight vector (from 405 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 220

Removed 1 non-pure weight vector
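A weight vector is non-pure when identical copies of it carry both match and non-match labels; the program drops the minority-class copies, as removed above. A sketch of that cleaning step (the >= 0.5 majority rule is an assumption about how ties are handled):

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """Group identical weight vectors, compute each group's pureness
    (fraction of match labels), and keep only majority-class copies."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        majority = (sum(labels) / len(labels)) >= 0.5  # assumed tie rule
        kept.extend((list(vec), lab) for lab in labels if lab == majority)
    return kept
```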

Final number of weight vectors to use: 440
  Number of unique weight vectors: 405

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (405, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 405 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 405 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 39 matches and 38 non-matches
    Purity of oracle classification:  0.506
    Entropy of oracle classification: 1.000
    Number of true matches:      39
    Number of false matches:     0
    Number of true non-matches:  38
    Number of false non-matches: 0

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 328 weight vectors
  Based on 39 matches and 38 non-matches
  Classified 159 matches and 169 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.5064935064935064, 0.9998783322990061, 0.5064935064935064)
    (169, 0.5064935064935064, 0.9998783322990061, 0.5064935064935064)

Current size of match and non-match training data sets: 39 / 38

Selected cluster (queue ordering: random) with:
- Purity 0.51 and entropy 1.00
- Size 169 weight vectors
- Estimated match proportion 0.506

Sample size for this cluster: 61

Farthest first selection of 61 weight vectors from 169 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [0.717, 1.000, 0.240, 0.231, 0.065, 0.192, 0.184] (False)
    [0.817, 1.000, 0.182, 0.115, 0.154, 0.194, 0.111] (False)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [0.800, 1.000, 0.333, 0.267, 0.180, 0.132, 0.281] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.939, 1.000, 0.273, 0.083, 0.227, 0.095, 0.294] (False)
    [1.000, 0.000, 0.625, 0.571, 0.412, 0.474, 0.556] (False)
    [0.817, 1.000, 0.250, 0.212, 0.256, 0.045, 0.250] (False)
    [0.450, 1.000, 0.176, 0.121, 0.242, 0.158, 0.217] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [0.344, 1.000, 0.180, 0.255, 0.171, 0.189, 0.000] (False)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.929, 1.000, 0.182, 0.238, 0.188, 0.146, 0.270] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.367, 1.000, 0.160, 0.170, 0.077, 0.200, 0.178] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.680, 1.000, 0.150, 0.250, 0.333, 0.192, 0.000] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.700, 1.000, 0.250, 0.042, 0.154, 0.222, 0.222] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.663, 1.000, 0.273, 0.244, 0.226, 0.196, 0.238] (False)
    [1.000, 0.000, 0.483, 0.818, 0.773, 0.478, 0.727] (False)
    [0.750, 0.000, 0.206, 0.391, 0.351, 0.261, 0.146] (False)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 0.000, 0.636, 0.800, 0.471, 0.333, 0.625] (False)
    [0.544, 1.000, 0.091, 0.226, 0.255, 0.238, 0.000] (False)
    [0.913, 1.000, 0.184, 0.175, 0.087, 0.233, 0.167] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.715, 1.000, 0.214, 0.125, 0.270, 0.214, 0.167] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.488, 1.000, 0.231, 0.167, 0.105, 0.122, 0.000] (False)
    [0.625, 1.000, 0.217, 0.160, 0.151, 0.094, 0.203] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.750, 1.000, 0.243, 0.243, 0.214, 0.111, 0.132] (False)
    [0.947, 1.000, 0.292, 0.178, 0.227, 0.122, 0.154] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.800, 1.000, 0.259, 0.229, 0.214, 0.258, 0.156] (False)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [0.781, 1.000, 0.231, 0.183, 0.114, 0.245, 0.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)

Perform oracle with 100.00% accuracy on 61 weight vectors
  The oracle will correctly classify 61 weight vectors and wrongly classify 0
  Classified 1 match and 60 non-matches
    Purity of oracle classification:  0.984
    Entropy of oracle classification: 0.121
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 61 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)870_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 870), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)870_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 101 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)827_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 827), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)827_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 391
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 391 weight vectors
  Containing 217 true matches and 174 true non-matches
    (55.50% true matches)
  Identified 358 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   342  (95.53%)
          2 :    13  (3.63%)
          3 :     2  (0.56%)
         17 :     1  (0.28%)

Identified 1 non-pure unique weight vector (from 358 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 173

Removed 1 non-pure weight vector

Final number of weight vectors to use: 390
  Number of unique weight vectors: 358

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (358, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 358 weight vectors
- Estimated match proportion 0.500
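The queue tuples above read as (size, purity, entropy, estimated match proportion). Purity appears to be the majority-class proportion and entropy the binary Shannon entropy of the match proportion; a sketch under that assumption:

```python
import math

def purity(p_match):
    """Cluster purity = proportion of the majority class."""
    return max(p_match, 1.0 - p_match)

def entropy(p_match):
    """Binary Shannon entropy (in bits) of the match proportion."""
    if p_match in (0.0, 1.0):
        return 0.0
    q = 1.0 - p_match
    return -(p_match * math.log2(p_match) + q * math.log2(q))

# Initial cluster, half matches assumed: purity 0.5, entropy 1.0
print(purity(0.5), entropy(0.5))
# The oracle result later in this loop (43 matches of 76 samples):
print(round(purity(43 / 76), 3), round(entropy(43 / 76), 3))  # 0.566 0.987
```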

Sample size for this cluster: 76

Perform initial selection using "far" method

Farthest first selection of 76 weight vectors from 358 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

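The "far" method logged above is a farthest-first traversal: it greedily picks weight vectors that are maximally spread out in the similarity space. A minimal sketch, assuming Euclidean distance and an arbitrary start vector (names are illustrative, not from the original script):

```python
import math

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly select the vector
    whose distance to its nearest already-selected vector is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[start]]
    # min_d[i]: distance from vectors[i] to the closest selected vector
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=min_d.__getitem__)
        selected.append(vectors[i])
        min_d = [min(d, dist(v, vectors[i])) for d, v in zip(min_d, vectors)]
    return selected

pts = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(farthest_first(pts, 3))  # [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```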
Perform oracle with 100.00% accuracy on 76 weight vectors
  The oracle will correctly classify 76 weight vectors and wrongly classify 0
  Classified 43 matches and 33 non-matches
    Purity of oracle classification:  0.566
    Entropy of oracle classification: 0.987
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  33
    Number of false non-matches: 0
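The oracle step can be simulated by returning the true match status with probability equal to the configured accuracy (at 100%, as here, no labels are flipped). A hedged sketch with illustrative names:

```python
import random

def oracle_classify(labelled_vectors, accuracy=1.0, rng=None):
    """Simulate a manual oracle: return the true match status with
    probability `accuracy`, otherwise the flipped status."""
    rng = rng or random.Random(42)  # seeded for reproducibility
    matches, non_matches = [], []
    for vec, is_match in labelled_vectors:
        answer = is_match if rng.random() < accuracy else not is_match
        (matches if answer else non_matches).append(vec)
    return matches, non_matches

sample = [((1.0, 0.9), True), ((0.2, 0.1), False), ((0.8, 0.7), True)]
m, n = oracle_classify(sample, accuracy=1.0)
print(len(m), len(n))  # 2 1
```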

Deleted 76 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 282 weight vectors
  Based on 43 matches and 33 non-matches
  Classified 282 matches and 0 non-matches
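The split classifier here is an SVM trained on the oracle-labelled vectors and applied to the rest of the cluster. As a self-contained stand-in, the sketch below uses a simple perceptron (not an SVM) to illustrate the train-then-split step only:

```python
def train_split_classifier(X, y, epochs=100):
    """Perceptron stand-in for the SVM split classifier.
    X: weight vectors; y: +1 (match) / -1 (non-match) oracle labels."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) <= 0:
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                b += yi
                errors += 1
        if errors == 0:          # converged on the training sample
            break
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1

# Train on oracle-labelled vectors, then classify the remaining cluster:
w, b = train_split_classifier([[0.9, 0.9], [0.1, 0.1]], [1, -1])
print(predict(w, b, [0.95, 0.9]), predict(w, b, [0.05, 0.1]))  # 1 -1
```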

42.0
Analysing file: diverg(20)783_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 783), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)783_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 146 matches and 538 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (146, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (538, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 538 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 538 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 9 matches and 65 non-matches
    Purity of oracle classification:  0.878
    Entropy of oracle classification: 0.534
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)723_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 723), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)723_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 789
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 789 weight vectors
  Containing 225 true matches and 564 true non-matches
    (28.52% true matches)
  Identified 750 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   731  (97.47%)
          2 :    16  (2.13%)
          3 :     2  (0.27%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 750 unique weight vectors)
Pureness (as the percentage of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 561

Removed 1 non-pure weight vector

Final number of weight vectors to use: 788
  Number of unique weight vectors: 750

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (750, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 750 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 750 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 31 matches and 54 non-matches
    Purity of oracle classification:  0.635
    Entropy of oracle classification: 0.947
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 665 weight vectors
  Based on 31 matches and 54 non-matches
  Classified 150 matches and 515 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)
    (515, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)

Current size of match and non-match training data sets: 31 / 54

Selected cluster (queue ordering: random):
- Purity 0.64 and entropy 0.95
- Size 150 weight vectors
- Estimated match proportion 0.365

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 51 matches and 5 non-matches
    Purity of oracle classification:  0.911
    Entropy of oracle classification: 0.434
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0
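For reference, the purity and entropy figures the oracle reports can be reproduced from the match/non-match counts alone. A minimal sketch (the function names are illustrative, not taken from the original script):

```python
import math

def purity(num_matches, num_non_matches):
    """Fraction of the majority class among the classified vectors."""
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    """Shannon entropy (base 2) of the match / non-match split."""
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

print(round(purity(51, 5), 3))   # 0.911
print(round(entropy(51, 5), 3))  # 0.434
```

With the 51 matches and 5 non-matches above, this reproduces the reported purity 0.911 and entropy 0.434.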

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)488_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 488), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)488_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 548
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 548 weight vectors
  Containing 226 true matches and 322 true non-matches
    (41.24% true matches)
  Identified 509 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   490  (96.27%)
          2 :    16  (3.14%)
          3 :     2  (0.39%)
         20 :     1  (0.20%)
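The frequency distribution above is a two-level count: how often each unique weight vector occurs, and then how many unique vectors share each occurrence count. A minimal sketch, assuming weight vectors arrive as lists of floats:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of unique vectors occurring that often."""
    per_vector = Counter(tuple(v) for v in weight_vectors)  # count each unique vector
    return Counter(per_vector.values())                     # then count the counts

dist = occurrence_distribution(
    [[0.1], [0.1], [0.2], [0.3], [0.3], [0.3]]
)
# dist == {1: 1, 2: 1, 3: 1}: one vector occurs once, one twice, one three times
```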

Identified 1 non-pure unique weight vector (from 509 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 319

Removed 1 non-pure weight vector
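Pureness here is the fraction of a unique weight vector's occurrences that come from true matches; for vectors that are neither fully matches (1.000) nor fully non-matches (0.000), the minority-class copies are removed. A sketch under that reading (the `>= 0.5` tie-break and the data layout are illustrative assumptions):

```python
from collections import defaultdict

def pureness_filter(weight_vectors):
    """weight_vectors: list of (vector_tuple, is_true_match) pairs.
    Returns the list with minority-class copies of non-pure unique
    vectors removed."""
    counts = defaultdict(lambda: [0, 0])   # vector -> [non-match count, match count]
    for vec, is_match in weight_vectors:
        counts[vec][1 if is_match else 0] += 1

    kept = []
    for vec, is_match in weight_vectors:
        non, mat = counts[vec]
        pureness = mat / (mat + non)
        majority_is_match = pureness >= 0.5  # illustrative tie-break
        if is_match == majority_is_match:    # keep only majority-class copies
            kept.append((vec, is_match))
    return kept
```

For example, a vector occurring 20 times with 19 matches has pureness 0.950, and its single non-match copy is dropped.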

Final number of weight vectors to use: 547
  Number of unique weight vectors: 509

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (509, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 509 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 509 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
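Farthest-first selection greedily picks, at each step, the candidate whose distance to its nearest already-selected vector is largest, spreading the sample across the weight-vector space. A minimal sketch assuming Euclidean distance and a fixed starting vector (both choices are illustrative; the original script may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector,
    then repeatedly add the vector farthest from the selected set."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    while len(selected) < k:
        # For each candidate, score by distance to its nearest selected vector
        best = max(
            (v for v in vectors if v not in selected),
            key=lambda v: min(dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.0, 1.0)], 2)
# sample == [(0.0, 0.0), (1.0, 1.0)]: the two mutually farthest points
```

This is why the selected vectors above mix extreme values (many 0.000 and 1.000 components) rather than clustering around typical similarity scores.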

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 33 matches and 48 non-matches
    Purity of oracle classification:  0.593
    Entropy of oracle classification: 0.975
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  48
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 428 weight vectors
  Based on 33 matches and 48 non-matches
  Classified 152 matches and 276 non-matches
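After the oracle labels the sample, the remaining cluster is split by a classifier trained on those labels. The log shows an SVM; in the dependency-free sketch below a simple perceptron stands in for it, so treat the classifier as a placeholder for the script's actual SVM setup, not a reproduction of it:

```python
def train_linear_classifier(vectors, labels, epochs=50, lr=0.1):
    """Perceptron stand-in for the SVM; labels are 1 (match) or 0 (non-match)."""
    w = [0.0] * len(vectors[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            if err:  # update weights only on misclassification
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
    return w, b

def split_cluster(train_vecs, train_labels, remaining):
    """Split the unlabelled remainder of a cluster by predicted class."""
    w, b = train_linear_classifier(train_vecs, train_labels)
    matches, non_matches = [], []
    for x in remaining:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        (matches if score > 0 else non_matches).append(x)
    return matches, non_matches
```

The two resulting sub-clusters (here 152 predicted matches and 276 predicted non-matches) are pushed back onto the queue for further sampling.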

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (152, 0.5925925925925926, 0.975119064940866, 0.4074074074074074)
    (276, 0.5925925925925926, 0.975119064940866, 0.4074074074074074)

Current size of match and non-match training data sets: 33 / 48

Selected cluster (queue ordering: random) with:
- Purity 0.59 and entropy 0.98
- Size 152 weight vectors
- Estimated match proportion 0.407

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 152 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 53 matches and 5 non-matches
    Purity of oracle classification:  0.914
    Entropy of oracle classification: 0.424
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(15)167_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 167), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)167_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 546
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 546 weight vectors
  Containing 226 true matches and 320 true non-matches
    (41.39% true matches)
  Identified 507 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   488  (96.25%)
          2 :    16  (3.16%)
          3 :     2  (0.39%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 507 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 317

Removed 1 non-pure weight vector

Final number of weight vectors to use: 545
  Number of unique weight vectors: 507

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (507, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 507 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 81

Perform initial selection using "far" method

Farthest first selection of 81 weight vectors from 507 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 81 weight vectors
  The oracle will correctly classify 81 weight vectors and wrongly classify 0
  Classified 34 matches and 47 non-matches
    Purity of oracle classification:  0.580
    Entropy of oracle classification: 0.981
    Number of true matches:      34
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0

Deleted 81 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 426 weight vectors
  Based on 34 matches and 47 non-matches
  Classified 151 matches and 275 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 81
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (151, 0.5802469135802469, 0.9813387358307915, 0.41975308641975306)
    (275, 0.5802469135802469, 0.9813387358307915, 0.41975308641975306)

Current size of match and non-match training data sets: 34 / 47

Selected cluster (queue ordering: random) with:
- Purity 0.58 and entropy 0.98
- Size 151 weight vectors
- Estimated match proportion 0.420

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 151 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 0.900, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 53 matches and 5 non-matches
    Purity of oracle classification:  0.914
    Entropy of oracle classification: 0.424
    Number of true matches:      53
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)156_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.977273
recall                 0.431438
f-measure              0.598608
da                          132
dm                            0
ndm                           0
tp                          129
fp                            3
tn                  4.76529e+07
fn                          170
Name: (10, 1 - acm diverg, 156), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)156_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 506
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 506 weight vectors
  Containing 121 true matches and 385 true non-matches
    (23.91% true matches)
  Identified 493 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   483  (97.97%)
          2 :     7  (1.42%)
          3 :     3  (0.61%)

Identified 0 non-pure unique weight vectors (from 493 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 110
     0.000 : 383

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 506
  Number of unique weight vectors: 493

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (493, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 493 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 493 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.778, 0.429, 0.571, 0.750, 0.600] (False)
    [1.000, 0.000, 0.435, 0.700, 0.600, 0.647, 0.714] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.667, 0.000, 0.500, 0.679, 0.583, 0.588, 0.333] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.950, 0.000, 0.619, 0.800, 0.478, 0.280, 0.625] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 27 matches and 53 non-matches
    Purity of oracle classification:  0.662
    Entropy of oracle classification: 0.922
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0
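
The purity and entropy figures reported after each oracle round follow the standard two-class definitions: purity is the majority-class fraction and entropy is the binary Shannon entropy of the class counts. A minimal sketch (the function names `cluster_purity` and `cluster_entropy` are illustrative, not taken from the original script):

```python
import math

def cluster_purity(num_match, num_non_match):
    """Majority-class fraction of a two-class cluster."""
    total = num_match + num_non_match
    return max(num_match, num_non_match) / total

def cluster_entropy(num_match, num_non_match):
    """Binary Shannon entropy (in bits) of the class distribution."""
    total = num_match + num_non_match
    entropy = 0.0
    for count in (num_match, num_non_match):
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy

# The 27 matches / 53 non-matches classified by the oracle above:
purity = cluster_purity(27, 53)    # 0.6625
entropy = cluster_entropy(27, 53)  # ~0.9224, matching the log
```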

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 413 weight vectors
  Based on 27 matches and 53 non-matches
  Classified 87 matches and 326 non-matches
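
The split step trains an SVM on the 27 + 53 oracle-labelled vectors and partitions the remaining cluster by predicted class. The sketch below keeps that split logic but plainly substitutes a nearest-centroid classifier for the SVM so it needs no external libraries; `split_cluster` and its argument names are illustrative:

```python
def centroid(vectors):
    """Element-wise mean of a list of equal-length weight vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(match_train, non_match_train, unlabelled):
    """Partition unlabelled vectors into (predicted_matches, predicted_non_matches)
    using nearest-centroid as a stand-in for the SVM in the log above."""
    m_cent = centroid(match_train)
    n_cent = centroid(non_match_train)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    pred_match, pred_non_match = [], []
    for vec in unlabelled:
        # Assign to whichever training centroid is closer (ties -> match)
        if sq_dist(vec, m_cent) <= sq_dist(vec, n_cent):
            pred_match.append(vec)
        else:
            pred_non_match.append(vec)
    return pred_match, pred_non_match
```

Both child clusters inherit the parent sample's purity, entropy, and estimated match proportion until they are themselves sampled, which is why the two queue entries in Loop 2 share identical statistics.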

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (87, 0.6625, 0.9224062617590723, 0.3375)
    (326, 0.6625, 0.9224062617590723, 0.3375)

Current size of match and non-match training data sets: 27 / 53

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 326 weight vectors
- Estimated match proportion 0.338

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 326 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [1.000, 0.000, 0.478, 0.786, 0.500, 0.471, 0.429] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [1.000, 0.000, 0.538, 0.677, 0.316, 0.714, 0.381] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.633, 0.867, 0.500, 0.204, 0.396] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
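
Farthest-first selection greedily picks, at each step, the not-yet-selected vector whose minimum distance to the already-selected vectors is largest, so the sample spreads across the cluster. A minimal sketch (the seed choice and squared-Euclidean distance are assumptions; the original script may differ in both):

```python
def farthest_first(vectors, k):
    """Greedy farthest-first traversal: select k vectors by repeatedly
    maximising the minimum distance to the current selection."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    selected = [vectors[0]]  # seeding from the first vector is an assumption
    # Minimum squared distance from every vector to the current selection
    min_dists = [sq_dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dists[i])
        selected.append(vectors[idx])
        for i, v in enumerate(vectors):
            d = sq_dist(v, vectors[idx])
            if d < min_dists[i]:
                min_dists[i] = d
    return selected
```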

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 1 matches and 67 non-matches
    Purity of oracle classification:  0.985
    Entropy of oracle classification: 0.111
    Number of true matches:      1
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

132.0
Analysing the file: diverg(10)457_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (10, 1 - acm diverg, 457), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)457_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 778
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 778 weight vectors
  Containing 223 true matches and 555 true non-matches
    (28.66% true matches)
  Identified 724 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   687  (94.89%)
          2 :    34  (4.70%)
          3 :     2  (0.28%)
         17 :     1  (0.14%)
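
The occurrence histogram above can be computed by counting each distinct weight vector and then counting the counts; a sketch using `collections.Counter` (`occurrence_distribution` is an illustrative name):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map 'number of occurrences' -> 'number of distinct vectors
    that occur that often'."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)  # vector -> occurrences
    return Counter(vec_counts.values())                     # occurrences -> vectors

vectors = [[1.0, 0.5], [1.0, 0.5], [0.2, 0.3], [0.9, 0.9]]
dist = occurrence_distribution(vectors)  # {1: 2, 2: 1}
```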

Identified 1 non-pure unique weight vector (from 724 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 534

Removed 1 non-pure weight vector
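
The pureness step groups the true-match labels of each unique weight vector, computes the fraction of matches, and drops the minority-class copies of any non-pure vector (above, the one vector with pureness 0.941, i.e. 16 matches out of 17 copies, loses its single non-match copy). A sketch, assuming majority vote with ties resolved as matches:

```python
from collections import defaultdict

def remove_minority_class(pairs):
    """pairs: list of (weight_vector_tuple, is_match) items.
    Returns the list with minority-class copies of non-pure vectors removed."""
    labels = defaultdict(list)
    for vec, is_match in pairs:
        labels[vec].append(is_match)

    kept = []
    for vec, is_match in pairs:
        lab = labels[vec]
        match_frac = sum(lab) / len(lab)   # pureness as fraction of matches
        majority_is_match = match_frac >= 0.5  # tie-breaking is an assumption
        if is_match == majority_is_match:
            kept.append((vec, is_match))
    return kept
```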

Final number of weight vectors to use: 777
  Number of unique weight vectors: 724

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (724, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 724 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 724 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 27 matches and 58 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 639 weight vectors
  Based on 27 matches and 58 non-matches
  Classified 114 matches and 525 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (114, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)
    (525, 0.6823529411764706, 0.9018043446575508, 0.3176470588235294)

Current size of match and non-match training data sets: 27 / 58

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.90
- Size 114 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 114 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 47 matches and 1 non-matches
    Purity of oracle classification:  0.979
    Entropy of oracle classification: 0.146
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(15)447_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 447), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)447_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 518
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 518 weight vectors
  Containing 219 true matches and 299 true non-matches
    (42.28% true matches)
  Identified 480 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   462  (96.25%)
          2 :    15  (3.12%)
          3 :     2  (0.42%)
         20 :     1  (0.21%)

Identified 1 non-pure unique weight vector (from 480 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 183
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 296

Removed 1 non-pure weight vector

Final number of weight vectors to use: 517
  Number of unique weight vectors: 480

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (480, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 480 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 480 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.545, 0.786, 0.500, 0.444, 0.692] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 37 matches and 43 non-matches
    Purity of oracle classification:  0.537
    Entropy of oracle classification: 0.996
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  43
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 400 weight vectors
  Based on 37 matches and 43 non-matches
  Classified 157 matches and 243 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (157, 0.5375, 0.9959386076315955, 0.4625)
    (243, 0.5375, 0.9959386076315955, 0.4625)

Current size of match and non-match training data sets: 37 / 43

Selected cluster with (queue ordering: random):
- Purity 0.54 and entropy 1.00
- Size 157 weight vectors
- Estimated match proportion 0.463

Sample size for this cluster: 60

Farthest first selection of 60 weight vectors from 157 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 60 weight vectors
  The oracle will correctly classify 60 weight vectors and wrongly classify 0
  Classified 48 matches and 12 non-matches
    Purity of oracle classification:  0.800
    Entropy of oracle classification: 0.722
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  12
    Number of false non-matches: 0
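
The purity and entropy figures the oracle step prints can be reproduced from the match / non-match counts. A minimal sketch (the helper name `purity_entropy` is illustrative, not from the original script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the fraction of the majority class; entropy is the
    binary Shannon entropy of the match / non-match proportions."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log(q, 2) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

# 48 matches and 12 non-matches, as classified by the oracle above
purity, entropy = purity_entropy(48, 12)
print('%.3f %.3f' % (purity, entropy))  # 0.800 0.722
```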

Deleted 60 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analyzing file: diverg(20)345_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (20, 1 - acm diverg, 345), dtype: object
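
The f-measure in the series above is the harmonic mean of precision and recall, computed from the tp/fp/fn counts. A quick check (helper name illustrative):

```python
def f_measure(tp, fp, fn):
    """Precision, recall and F1 from true positive, false positive and
    false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# tp=42, fp=0, fn=257 as in the series above
print('%.6f' % f_measure(42, 0, 257)[2])  # 0.246334
```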

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)345_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1052
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1052 weight vectors
  Containing 223 true matches and 829 true non-matches
    (21.20% true matches)
  Identified 998 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   961  (96.29%)
          2 :    34  (3.41%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 998 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 808

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1051
  Number of unique weight vectors: 998
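
The pureness filter above groups identical weight vectors and drops the minority-class copies of any vector that occurs with both labels (e.g. a vector occurring 17 times as 16 matches and 1 non-match has pureness 16/17 = 0.941, and the single non-match copy is removed). A sketch of that filter (the function name `remove_non_pure` is illustrative):

```python
from collections import defaultdict

def remove_non_pure(labelled_vectors):
    """Group identical weight vectors; if a vector occurs with both match
    and non-match labels, keep only its majority-class copies."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        num_match = sum(labels)
        if 0 < num_match < len(labels):      # non-pure vector
            majority = num_match * 2 >= len(labels)
            count = max(num_match, len(labels) - num_match)
            kept.extend((list(vec), majority) for _ in range(count))
        else:                                # already pure
            kept.extend((list(vec), m) for m in labels)
    return kept
```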

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (998, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 998 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 998 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
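
The "far" selection above is a farthest-first traversal over the weight vectors. A stdlib-only sketch of the idea (the seeding rule here, starting from the first vector, is an assumption; the original script may seed differently):

```python
import math

def farthest_first(vectors, k):
    """Greedily pick k vectors: seed with the first vector, then repeatedly
    add the vector whose minimum Euclidean distance to the selected set is
    largest, spreading samples across the weight-vector space."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    min_dist = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(vectors[idx])
        min_dist = [min(d, dist(v, vectors[idx]))
                    for v, d in zip(vectors, min_dist)]
    return selected

corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
print(farthest_first(corners, 3))  # picks well-spread points, e.g. the corners
```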

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 911 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 118 matches and 793 non-matches
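
The split step trains a classifier on the oracle-labelled samples and partitions the rest of the cluster by predicted class. A sketch assuming scikit-learn (the original script's SVM kernel and parameters are not shown in this log, so a linear kernel stands in here):

```python
from sklearn import svm

def svm_split(match_train, non_match_train, remaining):
    """Fit a binary SVM on the labelled weight vectors, then split the
    remaining unlabelled vectors into predicted matches / non-matches."""
    X = match_train + non_match_train
    y = [1] * len(match_train) + [0] * len(non_match_train)
    clf = svm.SVC(kernel='linear')
    clf.fit(X, y)
    pred = clf.predict(remaining)
    matches = [v for v, p in zip(remaining, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining, pred) if p == 0]
    return matches, non_matches  # the two sub-clusters pushed on the queue
```

The two predicted sub-clusters then re-enter the queue, which is why Loop 2 above shows two clusters whose sizes sum to the unclassified remainder.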

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (118, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (793, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster with (queue ordering: random):
- Purity 0.70 and entropy 0.88
- Size 793 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 14 matches and 59 non-matches
    Purity of oracle classification:  0.808
    Entropy of oracle classification: 0.705
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analyzing file: diverg(20)151_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (20, 1 - acm diverg, 151), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)151_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 209 true matches and 874 true non-matches
    (19.30% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1001  (96.62%)
          2 :    32  (3.09%)
          3 :     2  (0.19%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 101 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (101, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 101 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 101 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.960, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.842, 0.833, 0.895, 0.833, 0.889] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analyzing file: diverg(20)142_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 142), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)142_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 971
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 971 weight vectors
  Containing 219 true matches and 752 true non-matches
    (22.55% true matches)
  Identified 916 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   880  (96.07%)
          2 :    33  (3.60%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 916 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

Final number of weight vectors to use: 970
  Number of unique weight vectors: 916

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (916, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 916 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 916 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
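
The "far" initial selection above repeatedly picks the weight vector farthest from everything selected so far. A minimal sketch of farthest-first selection, assuming Euclidean distance and seeding from the first vector (the script's actual metric and seeding may differ):

```python
import math

def farthest_first(vectors, k):
    # Seed with the first vector, then greedily add the remaining vector
    # whose minimum distance to the already-selected set is largest.
    selected = [vectors[0]]
    remaining = vectors[1:]
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        remaining.remove(best)
        selected.append(best)
    return selected

# Toy usage in 2-d (the log's vectors have 7 similarity weights)
corners = farthest_first([(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.5, 0.5)], 3)
```

Each greedy step costs O(|selected| * |remaining|) distance evaluations, which is why the sample sizes above are kept small relative to the cluster.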

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0
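
The purity and entropy figures reported for each oracle sample follow directly from the match / non-match counts: purity is the majority-class fraction, and entropy is the base-2 Shannon entropy of the two-class split. A sketch reproducing the numbers above:

```python
import math

def purity(num_matches, num_non_matches):
    # Fraction of the sample belonging to the majority class.
    total = num_matches + num_non_matches
    return max(num_matches, num_non_matches) / total

def entropy(num_matches, num_non_matches):
    # Base-2 Shannon entropy of the match / non-match split.
    total = num_matches + num_non_matches
    h = 0.0
    for count in (num_matches, num_non_matches):
        if count > 0:
            p = count / total
            h -= p * math.log2(p)
    return h

# The oracle sample above: 24 matches and 63 non-matches
print(round(purity(24, 63), 3), round(entropy(24, 63), 3))  # prints: 0.724 0.85
```

A perfectly pure sample (e.g. 0 matches, 72 non-matches, as in a later loop) gives purity 1.000 and entropy 0.000.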

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 829 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 0 matches and 829 non-matches
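
At this point the script trains an SVM on the oracle-labelled sample and uses it to split the remaining cluster into predicted matches and non-matches. The log does not show which SVM implementation is used; as an illustrative stand-in, here is a minimal linear SVM trained with Pegasos-style sub-gradient descent on the hinge loss (hyper-parameters are assumptions):

```python
def train_linear_svm(X, y, lam=0.01, epochs=500):
    # Pegasos-style sub-gradient descent on the regularized hinge loss.
    # X: list of weight vectors; y: labels, +1 = match, -1 = non-match.
    dim = len(X[0])
    w = [0.0] * dim
    b = 0.0
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)          # decaying learning rate
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            shrink = 1.0 - eta * lam       # weight decay from the L2 term
            if margin < 1.0:               # point violates the margin
                w = [shrink * wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
            else:
                w = [shrink * wj for wj in w]
    return w, b

def classify(w, b, x):
    # +1 (match) if x lies on the positive side of the hyperplane.
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1

# Toy usage: three "match" vectors near (1, 1), three "non-match" near (0, 0)
X = [[0.9, 1.0], [1.0, 0.8], [0.8, 0.9], [0.1, 0.0], [0.0, 0.2], [0.2, 0.1]]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
```

Splitting a cluster then amounts to running classify over its remaining weight vectors and queueing the predicted-match and predicted-non-match subsets separately, as seen in the loop output above.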

40.0
Analysing file: diverg(10)489_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 489), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)489_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 640
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 640 weight vectors
  Containing 208 true matches and 432 true non-matches
    (32.50% true matches)
  Identified 588 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   552  (93.88%)
          2 :    33  (5.61%)
          3 :     2  (0.34%)
         16 :     1  (0.17%)
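
A frequency distribution like the one above (how many unique weight vectors occur once, twice, and so on) can be computed with two passes of a counter:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    # Count how often each distinct weight vector occurs, then count how
    # many unique vectors share each occurrence count. Vectors are
    # converted to tuples so they can serve as dictionary keys.
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return Counter(per_vector.values())

# Toy usage: one vector occurring twice, one once, one three times
dist = occurrence_distribution([[0.5], [0.5], [1.0], [0.2], [0.2], [0.2]])
```

Here `dist[k]` is the number of unique vectors that occur exactly k times, matching the "Occurrence : Number of weight vectors" table format above.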

Identified 1 non-pure unique weight vector (from 588 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 176
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 411

Removed 1 non-pure weight vector
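
The minority-class removal reported above (the unique vector with pureness 0.938 loses its single non-match copy) can be sketched as follows; the >= 0.5 majority tie-break is an assumption, not something the log states:

```python
from collections import defaultdict

def remove_minority_class(pairs):
    # pairs: (weight_vector_tuple, is_match) records. For each non-pure
    # unique vector (0 < match fraction < 1), drop the records of its
    # minority class and keep the majority side; pure vectors pass through.
    by_vec = defaultdict(list)
    for vec, is_match in pairs:
        by_vec[vec].append(is_match)
    kept = []
    for vec, labels in by_vec.items():
        pureness = sum(labels) / len(labels)
        if 0.0 < pureness < 1.0:
            majority = pureness >= 0.5   # assumed tie-break toward matches
            kept.extend((vec, majority) for lab in labels if lab == majority)
        else:
            kept.extend((vec, lab) for lab in labels)
    return kept

# Toy usage mirroring the log: 15 match copies + 1 non-match copy of the
# same vector (pureness 15/16 = 0.938), plus two pure vectors
data = ([((0.5,), True)] * 15 + [((0.5,), False)]
        + [((0.9,), True), ((0.1,), False)])
cleaned = remove_minority_class(data)
```

After cleaning, the single non-match copy is gone and 17 of the 18 records remain, matching the "Removed 1 non-pure weight vector" step above.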

Final number of weight vectors to use: 639
  Number of unique weight vectors: 588

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (588, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 588 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 588 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 32 matches and 50 non-matches
    Purity of oracle classification:  0.610
    Entropy of oracle classification: 0.965
    Number of true matches:      32
    Number of false matches:     0
    Number of true non-matches:  50
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 506 weight vectors
  Based on 32 matches and 50 non-matches
  Classified 168 matches and 338 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (168, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)
    (338, 0.6097560975609756, 0.9649567669505688, 0.3902439024390244)

Current size of match and non-match training data sets: 32 / 50

Selected cluster (queue ordering: random) with:
- Purity 0.61 and entropy 0.96
- Size 338 weight vectors
- Estimated match proportion 0.390

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 338 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.538, 0.500, 0.818, 0.789, 0.750] (False)
    [1.000, 0.000, 0.769, 0.609, 0.714, 0.765, 0.524] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 0.000, 0.750, 0.778, 0.471, 0.727, 0.684] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 0.000, 0.300, 0.786, 0.818, 0.778, 0.846] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.833, 0.571, 0.727, 0.647, 0.857] (False)
    [1.000, 0.000, 0.857, 0.286, 0.500, 0.643, 0.600] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [0.800, 0.000, 0.625, 0.571, 0.467, 0.474, 0.667] (False)
    [1.000, 0.000, 0.423, 0.478, 0.500, 0.813, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.375, 0.833, 0.800, 0.583, 0.313] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.522, 0.929, 0.526, 0.235, 0.286] (False)
    [1.000, 0.000, 0.583, 0.389, 0.471, 0.545, 0.474] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.385, 0.391, 0.667, 0.579, 0.824] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.583, 0.571, 0.778, 0.471, 0.500] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.042, 0.500, 0.550, 0.875, 0.714] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.818, 0.909, 0.625, 0.500, 0.667] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 0.000, 0.000, 0.700, 0.818, 0.444, 0.619] (False)
    [1.000, 0.000, 0.857, 0.444, 0.556, 0.235, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [1.000, 0.000, 0.333, 0.750, 0.667, 0.667, 0.571] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.067, 0.550, 0.818, 0.727, 0.762] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 0 matches and 72 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(15)580_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 580), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)580_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1032
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1032 weight vectors
  Containing 222 true matches and 810 true non-matches
    (21.51% true matches)
  Identified 978 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   941  (96.22%)
          2 :    34  (3.48%)
          3 :     2  (0.20%)
         17 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 978 unique weight vectors)
Pureness (as fraction of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 188
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 789

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1031
  Number of unique weight vectors: 978

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (978, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 978 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 978 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 891 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 154 matches and 737 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (154, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (737, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 737 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 737 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 5 matches and 70 non-matches
    Purity of oracle classification:  0.933
    Entropy of oracle classification: 0.353
    Number of true matches:      5
    Number of false matches:     0
    Number of true non-matches:  70
    Number of false non-matches: 0
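The purity and entropy figures reported by the oracle step above are the standard binary values: purity is the fraction of the sample in the majority class, and entropy is the binary (base-2) entropy of the match proportion. A minimal sketch (function name is illustrative, not from the original script):

```python
import math

def purity_entropy(num_match, num_non_match):
    """Purity and binary entropy of an oracle-classified sample."""
    total = num_match + num_non_match
    p = num_match / total          # match proportion in the sample
    purity = max(p, 1 - p)         # fraction in the majority class
    entropy = 0.0
    for q in (p, 1 - p):
        if q > 0:                  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

With the 5 matches and 70 non-matches above this reproduces the logged purity 0.933 and entropy 0.353.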

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing the file: diverg(20)557_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 557), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)557_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1086
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1086 weight vectors
  Containing 220 true matches and 866 true non-matches
    (20.26% true matches)
  Identified 1030 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   994  (96.50%)
          2 :    33  (3.20%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vectors (from 1030 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 845

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1085
  Number of unique weight vectors: 1030

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1030, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1030 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1030 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
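The "farthest first" selection used above is the greedy farthest-first traversal: starting from one vector, repeatedly pick the vector whose minimum distance to the already-selected set is largest, until the sample budget is reached. A sketch under that assumption (Euclidean distance; the original script's distance measure and tie-breaking may differ):

```python
import numpy as np

def farthest_first(vectors, k, seed=0):
    """Greedily select k vectors, each maximising the minimum
    Euclidean distance to the vectors already selected."""
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    n = len(vectors)
    first = int(rng.integers(n))        # random starting vector
    selected = [first]
    # current minimum distance from every vector to the selected set
    dist = np.linalg.norm(vectors - vectors[first], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))      # farthest from the selected set
        selected.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(vectors - vectors[nxt], axis=1))
    return selected
```

This spreads the sample across the weight-vector space, which is why the selected vectors above mix clear matches, clear non-matches, and borderline cases.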

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 24 matches and 64 non-matches
    Purity of oracle classification:  0.727
    Entropy of oracle classification: 0.845
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 942 weight vectors
  Based on 24 matches and 64 non-matches
  Classified 86 matches and 856 non-matches
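The split step above trains an SVM on the 88 oracle-labelled vectors and uses it to divide the remaining 942 vectors into a predicted-match and a predicted-non-match sub-cluster. A hedged sketch using scikit-learn's `SVC` (the original script's SVM implementation and parameters are not shown in this log):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(labelled_vecs, labels, unlabelled_vecs):
    """Train an SVM on the oracle-classified sample and split the
    remaining weight vectors into predicted matches (label 1) and
    predicted non-matches (label 0)."""
    clf = SVC(kernel="linear")
    clf.fit(labelled_vecs, labels)
    pred = clf.predict(unlabelled_vecs)
    return unlabelled_vecs[pred == 1], unlabelled_vecs[pred == 0]
```

The two resulting sub-clusters are then pushed back onto the queue, as the Loop 2 header below shows (queue length 2, sizes 86 and 856).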

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (86, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)
    (856, 0.7272727272727273, 0.8453509366224365, 0.2727272727272727)

Current size of match and non-match training data sets: 24 / 64

Selected cluster (queue ordering: random):
- Purity 0.73 and entropy 0.85
- Size 856 weight vectors
- Estimated match proportion 0.273

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 856 vectors
  The selected farthest weight vectors are:
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 18 matches and 52 non-matches
    Purity of oracle classification:  0.743
    Entropy of oracle classification: 0.822
    Number of true matches:      18
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(10)663_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.976923
recall                 0.424749
f-measure              0.592075
da                          130
dm                            0
ndm                           0
tp                          127
fp                            3
tn                  4.76529e+07
fn                          172
Name: (10, 1 - acm diverg, 663), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)663_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 921
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 921 weight vectors
  Containing 138 true matches and 783 true non-matches
    (14.98% true matches)
  Identified 887 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   858  (96.73%)
          2 :    26  (2.93%)
          3 :     2  (0.23%)
          5 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 887 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 124
     0.000 : 763

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 921
  Number of unique weight vectors: 887

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (887, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 887 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 887 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 29 matches and 57 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 801 weight vectors
  Based on 29 matches and 57 non-matches
  Classified 244 matches and 557 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (244, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)
    (557, 0.6627906976744186, 0.9221231306777973, 0.3372093023255814)

Current size of match and non-match training data sets: 29 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 557 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 557 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.722, 0.000, 0.875, 0.810, 0.571, 0.643, 0.478] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.500, 0.667, 0.353, 0.556, 0.789] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.667, 0.733, 0.917, 0.714, 0.579] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.462, 0.667, 0.636, 0.368, 0.500] (False)
    [1.000, 0.000, 0.583, 0.452, 0.474, 0.294, 0.667] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.667, 0.400, 0.583, 0.563] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.462, 0.889, 0.455, 0.211, 0.375] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 0 matches and 74 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  74
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

130.0
Analysing the file: diverg(10)423_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.979381
recall                 0.317726
f-measure              0.479798
da                           97
dm                            0
ndm                           0
tp                           95
fp                            2
tn                  4.76529e+07
fn                          204
Name: (10, 1 - acm diverg, 423), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)423_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 955
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 955 weight vectors
  Containing 168 true matches and 787 true non-matches
    (17.59% true matches)
  Identified 918 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   887  (96.62%)
          2 :    28  (3.05%)
          3 :     2  (0.22%)
          6 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 918 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 151
     0.000 : 767

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 955
  Number of unique weight vectors: 918

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (918, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 918 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 918 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
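The "far" method listed above is a greedy farthest-first traversal: after seeding, it repeatedly adds the weight vector whose minimum distance to the already-selected vectors is largest, so the sample spreads across the whole cluster. A minimal sketch, assuming Euclidean distance and seeding from the first vector (the script's actual seeding rule may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors.

    Seeds from the first vector (an assumption) and repeatedly picks
    the vector farthest from the already-selected set.
    """
    selected = [vectors[0]]
    # min_d[i] = distance from vectors[i] to its nearest selected vector
    min_d = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], math.dist(v, vectors[i]))
    return selected
```

Each round costs one pass over the cluster, so selecting k of n vectors is O(kn) distance computations.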

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 25 matches and 62 non-matches
    Purity of oracle classification:  0.713
    Entropy of oracle classification: 0.865
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0
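The purity and entropy reported for each oracle-classified sample follow the standard two-class definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. For the 25 matches and 62 non-matches above, that gives 62/87 ≈ 0.713 and ≈ 0.865 bits:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Two-class purity (majority fraction) and Shannon entropy (bits)."""
    total = num_matches + num_non_matches
    p = num_matches / total  # match proportion
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy

print(purity_entropy(25, 62))  # the sample classified in loop 1 above
```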

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 831 weight vectors
  Based on 25 matches and 62 non-matches
  Classified 94 matches and 737 non-matches
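Once the oracle has labelled a sample, the rest of the cluster is split by a classifier trained on those labels, here an SVM fed the 25 match and 62 non-match examples. A minimal sketch with scikit-learn; the kernel and parameters are assumptions, since the log does not show them:

```python
from sklearn.svm import SVC

def svm_split(labelled_vectors, labels, unlabelled_vectors):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining cluster into predicted matches and non-matches."""
    clf = SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(labelled_vectors, labels)
    predictions = clf.predict(unlabelled_vectors)
    matches = [v for v, p in zip(unlabelled_vectors, predictions) if p == 1]
    non_matches = [v for v, p in zip(unlabelled_vectors, predictions) if p == 0]
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, as the next loop's queue length of 2 shows.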

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (94, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)
    (737, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 25 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.71 and entropy 0.87
- Size 94 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 94 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 42 matches and 1 non-match
    Purity of oracle classification:  0.977
    Entropy of oracle classification: 0.159
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  1
    Number of false non-matches: 0

Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

97.0
Analysing file: diverg(15)210_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                 0.976
recall                 0.408027
f-measure              0.575472
da                          125
dm                            0
ndm                           0
tp                          122
fp                            3
tn                  4.76529e+07
fn                          177
Name: (15, 1 - acm diverg, 210), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)210_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 944
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 944 weight vectors
  Containing 143 true matches and 801 true non-matches
    (15.15% true matches)
  Identified 910 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   881  (96.81%)
          2 :    26  (2.86%)
          3 :     2  (0.22%)
          5 :     1  (0.11%)

Identified 0 non-pure unique weight vectors (from 910 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 129
     0.000 : 781

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 944
  Number of unique weight vectors: 910

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (910, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 910 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 910 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 28 matches and 59 non-matches
    Purity of oracle classification:  0.678
    Entropy of oracle classification: 0.906
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 823 weight vectors
  Based on 28 matches and 59 non-matches
  Classified 96 matches and 727 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (96, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)
    (727, 0.6781609195402298, 0.9063701886077911, 0.3218390804597701)

Current size of match and non-match training data sets: 28 / 59

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 727 weight vectors
- Estimated match proportion 0.322

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 727 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.550, 0.857, 0.833, 0.389, 0.688] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.736, 1.000, 0.250, 0.290, 0.172, 0.188, 0.286] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 3 matches and 72 non-matches
    Purity of oracle classification:  0.960
    Entropy of oracle classification: 0.242
    Number of true matches:      3
    Number of false matches:     0
    Number of true non-matches:  72
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

125.0
Analysing file: diverg(15)494_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.140468
f-measure              0.246334
da                           42
dm                            0
ndm                           0
tp                           42
fp                            0
tn                  4.76529e+07
fn                          257
Name: (15, 1 - acm diverg, 494), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)494_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 669
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 669 weight vectors
  Containing 217 true matches and 452 true non-matches
    (32.44% true matches)
  Identified 636 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   620  (97.48%)
          2 :    13  (2.04%)
          3 :     2  (0.31%)
         17 :     1  (0.16%)

Identified 1 non-pure unique weight vectors (from 636 unique weight vectors)
Pureness (as the proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 184
     0.941 :  1   (minority-class weight vectors with this pureness are removed)
     0.000 : 451

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 668
  Number of unique weight vectors: 636

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (636, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 636 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 636 vectors
  The selected farthest weight vectors are:
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [0.819, 1.000, 0.222, 0.214, 0.182, 0.214, 0.333] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 31 matches and 52 non-matches
    Purity of oracle classification:  0.627
    Entropy of oracle classification: 0.953
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

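The purity and entropy figures reported in the oracle blocks correspond to the majority-class fraction and the binary Shannon entropy of the match proportion. A minimal sketch reproducing them (the function name `purity_entropy` is illustrative, not from the script):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total          # match proportion
    purity = max(p, 1.0 - p)         # fraction of vectors in the majority class
    if p in (0.0, 1.0):              # pure cluster: entropy is zero by convention
        entropy = 0.0
    else:
        entropy = -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
    return purity, entropy

# The oracle above classified 31 matches and 52 non-matches:
purity, entropy = purity_entropy(31, 52)
print(round(purity, 3), round(entropy, 3))  # 0.627 0.953
```

The estimated match proportion shown in the queue tuples is the same `p = 31/83 ≈ 0.373`.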
Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 553 weight vectors
  Based on 31 matches and 52 non-matches
  Classified 150 matches and 403 non-matches

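The SVM step above trains on the oracle-labelled sample (31 matches, 52 non-matches) and divides the remaining 553 unlabelled vectors into two sub-clusters. A sketch with scikit-learn, assuming a linear kernel (the script's actual SVM configuration is not shown in this log):

```python
import numpy as np
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, cluster_vecs):
    """Train an SVM on the oracle-labelled sample and split the remaining
    cluster into predicted-match / predicted-non-match sub-clusters."""
    clf = SVC(kernel="linear")  # assumed kernel; the original may differ
    clf.fit(np.asarray(train_vecs), np.asarray(train_labels))
    pred = clf.predict(np.asarray(cluster_vecs))
    matches = [v for v, p in zip(cluster_vecs, pred) if p]
    non_matches = [v for v, p in zip(cluster_vecs, pred) if not p]
    return matches, non_matches
```

Both resulting sub-clusters are then pushed back onto the queue, which is why Loop 2 reports a queue length of 2.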
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6265060240963856, 0.9533171305598173, 0.37349397590361444)
    (403, 0.6265060240963856, 0.9533171305598173, 0.37349397590361444)

Current size of match and non-match training data sets: 31 / 52

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 403 weight vectors
- Estimated match proportion 0.373

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 403 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [1.000, 0.000, 0.438, 0.677, 0.211, 0.357, 0.524] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 0.000, 0.300, 0.577, 0.545, 0.355, 0.263] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.481, 0.429, 0.750, 0.350, 0.778] (False)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.667, 0.000, 0.833, 0.526, 0.600, 0.700, 0.500] (False)
    [0.533, 0.000, 0.667, 0.643, 0.500, 0.529, 0.435] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [0.673, 0.000, 0.733, 0.737, 0.500, 0.250, 0.652] (False)
    [1.000, 0.000, 0.318, 0.581, 0.526, 0.250, 0.571] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.770, 0.000, 0.737, 0.667, 0.261, 0.533, 0.391] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.864, 0.667, 0.435, 0.700, 0.600] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.500, 0.826, 0.429, 0.538, 0.636] (False)
    [1.000, 0.000, 0.846, 0.857, 0.353, 0.318, 0.400] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.800, 0.696, 0.882, 0.727, 0.708] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.231, 0.609, 0.643, 0.722, 0.846] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.600, 0.500, 0.600, 0.722, 0.643] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.579, 0.867, 0.500, 0.574, 0.333] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.375, 0.619, 0.400, 0.778, 0.714] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

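Farthest-first selection, as used above, greedily picks the weight vector whose minimum distance to the already-selected set is largest, so the sample spreads across the whole cluster. A minimal sketch (the starting vector and Euclidean metric are assumptions):

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: begin at index `start`, then
    repeatedly add the vector farthest from the selected set."""
    X = np.asarray(vectors, dtype=float)
    selected = [start]
    min_dist = np.linalg.norm(X - X[start], axis=1)   # distance to selected set
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))                # farthest remaining vector
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(X - X[nxt], axis=1))
    return selected

print(farthest_first([[0, 0], [1, 0], [10, 0], [5, 0]], 3))  # [0, 2, 3]
```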
Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 4 matches and 69 non-matches
    Purity of oracle classification:  0.945
    Entropy of oracle classification: 0.306
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0

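The "oracle" above simulates a human reviewer with a configurable accuracy (the `oracle_acc` command-line argument): at 100.00% every true match status is returned unchanged, while lower settings flip each label with probability 1 - accuracy. A hedged sketch of such a simulation:

```python
import random

def noisy_oracle(true_labels, accuracy, seed=42):
    """Return each true label correctly with probability `accuracy`,
    otherwise flipped (simulating imperfect manual classification)."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab for lab in true_labels]

# At 100.00% accuracy the oracle never errs:
print(noisy_oracle([True, False, True], 1.0))  # [True, False, True]
```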
Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

42.0
Analysing file: diverg(15)973_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 973), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)973_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 953
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 953 weight vectors
  Containing 201 true matches and 752 true non-matches
    (21.09% true matches)
  Identified 908 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   874  (96.26%)
          2 :    31  (3.41%)
          3 :     2  (0.22%)
         11 :     1  (0.11%)

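The occurrence distribution above is obtained by counting duplicates among the weight vectors; a sketch using `collections.Counter` (the function name is illustrative):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of distinct weight vectors
    occurring exactly that many times."""
    per_vector = Counter(tuple(v) for v in weight_vectors)
    return dict(Counter(per_vector.values()))

vecs = [[0.1], [0.1], [0.5], [0.9], [0.9], [0.9]]
print(occurrence_distribution(vecs))  # {2: 1, 1: 1, 3: 1}
```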
Identified 1 non-pure unique weight vector (from 908 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 176
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 731

Removed 1 non-pure weight vector

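A non-pure unique weight vector is an identical vector that occurs with both match and non-match labels; the filtering above resolves it by removing the minority-class copies. A sketch, with hypothetical names (ties here keep the match label, an assumption):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors, labels):
    """For each distinct weight vector seen with both labels, keep only
    the copies carrying its majority label."""
    counts = defaultdict(lambda: [0, 0])          # vector -> [matches, non-matches]
    for vec, is_match in zip(weight_vectors, labels):
        counts[tuple(vec)][0 if is_match else 1] += 1
    kept = []
    for vec, is_match in zip(weight_vectors, labels):
        m, n = counts[tuple(vec)]
        if is_match == (m >= n):                  # this copy agrees with the majority
            kept.append((vec, is_match))
    return kept
```

In the run above, a single vector with pureness 0.909 had its one minority-class copy removed, leaving 952 of 953 weight vectors.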
Final number of weight vectors to use: 952
  Number of unique weight vectors: 908

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (908, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 908 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 908 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 25 matches and 62 non-matches
    Purity of oracle classification:  0.713
    Entropy of oracle classification: 0.865
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 821 weight vectors
  Based on 25 matches and 62 non-matches
  Classified 110 matches and 711 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (110, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)
    (711, 0.7126436781609196, 0.8652817028791377, 0.28735632183908044)

Current size of match and non-match training data sets: 25 / 62

Selected cluster (queue ordering: random) with:
- Purity 0.71 and entropy 0.87
- Size 711 weight vectors
- Estimated match proportion 0.287

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 711 vectors
  The selected farthest weight vectors are:
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.667, 0.737, 0.833, 0.818, 0.567] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 13 matches and 58 non-matches
    Purity of oracle classification:  0.817
    Entropy of oracle classification: 0.687
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing file: diverg(15)10_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 10), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)10_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 961
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 961 weight vectors
  Containing 217 true matches and 744 true non-matches
    (22.58% true matches)
  Identified 906 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   870  (96.03%)
          2 :    33  (3.64%)
          3 :     2  (0.22%)
         19 :     1  (0.11%)

Identified 1 non-pure unique weight vector (from 906 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 723

Removed 1 non-pure weight vector

Final number of weight vectors to use: 960
  Number of unique weight vectors: 906

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (906, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 906 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 906 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 26 matches and 61 non-matches
    Purity of oracle classification:  0.701
    Entropy of oracle classification: 0.880
    Number of true matches:      26
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 819 weight vectors
  Based on 26 matches and 61 non-matches
  Classified 135 matches and 684 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (135, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)
    (684, 0.7011494252873564, 0.8798813089176425, 0.2988505747126437)

Current size of match and non-match training data sets: 26 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.70 and entropy 0.88
- Size 684 weight vectors
- Estimated match proportion 0.299

Sample size for this cluster: 72

Farthest first selection of 72 weight vectors from 684 vectors
  The selected farthest weight vectors are:
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 72 weight vectors
  The oracle will correctly classify 72 weight vectors and wrongly classify 0
  Classified 13 matches and 59 non-matches
    Purity of oracle classification:  0.819
    Entropy of oracle classification: 0.681
    Number of true matches:      13
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0
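The purity and entropy reported above follow from the standard two-class definitions: purity is the majority-class fraction and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (function name ours, not the program's) that reproduces 59/72 ≈ 0.819 and the 0.681 entropy:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Majority-class purity and binary Shannon entropy of a cluster."""
    total = num_matches + num_non_matches
    p = num_matches / total                 # match proportion
    purity = max(p, 1.0 - p)               # fraction in the majority class
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

purity, entropy = purity_entropy(13, 59)   # the oracle result above
# round(purity, 3) == 0.819, round(entropy, 3) == 0.681
```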

Deleted 72 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(15)866_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 866), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)866_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 848
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 848 weight vectors
  Containing 225 true matches and 623 true non-matches
    (26.53% true matches)
  Identified 791 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   754  (95.32%)
          2 :    34  (4.30%)
          3 :     2  (0.25%)
         20 :     1  (0.13%)
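The occurrence histogram above is a count of counts: first how often each identical weight vector appears, then how many unique vectors share each frequency. In Python this is two nested `Counter`s (function name ours):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """How many unique weight vectors occur once, twice, etc."""
    per_vector = Counter(tuple(v) for v in weight_vectors)  # vector -> count
    return Counter(per_vector.values())                     # count -> #vectors

dist = occurrence_distribution(
    [[0.1, 0.2], [0.1, 0.2], [0.5, 0.5], [0.9, 0.9], [0.9, 0.9]])
# two vectors occur twice, one occurs once: {2: 2, 1: 1}
```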

Identified 1 non-pure unique weight vector (from 791 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 188
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 602

Removed 1 non-pure weight vector
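The pureness bookkeeping above — grouping identical weight vectors, computing each group's match fraction, and dropping the minority-class copies of any non-pure group — could be sketched as follows (illustrative names; the 0.5 majority threshold is our assumption, not taken from the program):

```python
from collections import defaultdict

def remove_minority_class(weight_vectors):
    """Group identical weight vectors, compute each group's pureness
    (fraction of matches), and drop the minority-class copies of any
    group whose pureness is strictly between 0 and 1."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        majority = pureness >= 0.5          # keep the majority label only
        for is_match in labels:
            if pureness in (0.0, 1.0) or is_match == majority:
                kept.append((list(vec), is_match))
    return kept

# one group with pureness 0.95: its single non-match copy is removed
data = [([0.9, 0.8], True)] * 19 + [([0.9, 0.8], False)] + [([0.1, 0.2], False)]
```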

Final number of weight vectors to use: 847
  Number of unique weight vectors: 791

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (791, 0.5, 1.0, 0.5)
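The queue printed above holds one descriptor per cluster; with random ordering, an iteration simply pops an arbitrary entry. A minimal sketch of that bookkeeping (names illustrative, not the program's):

```python
import random

# each queue entry: (size, purity, entropy, estimated_match_proportion)
queue = [(791, 0.5, 1.0, 0.5)]

random.seed(42)                              # reproducible pick for this sketch
cluster = queue.pop(random.randrange(len(queue)))
size, purity, entropy, est_match = cluster   # the cluster to sample next
```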

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 791 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 791 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
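The farthest-first traversal used for the selections above greedily picks the vector that maximises the distance to its nearest already-selected vector. A minimal sketch assuming Euclidean distance (the seeding rule and function name are ours):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly pick the vector whose
    distance to its nearest already-selected vector is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]              # seed with an arbitrary vector
    while len(selected) < k:
        # for each candidate, distance to the closest selected vector
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

sample = farthest_first([(0.0, 0.0), (1.0, 1.0), (0.5, 0.5), (0.9, 0.1)], 3)
```

The greedy rule tends to cover the corners of the weight-vector space first, which is why the selected vectors above mix extreme similarity profiles.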

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 25 matches and 60 non-matches
    Purity of oracle classification:  0.706
    Entropy of oracle classification: 0.874
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 706 weight vectors
  Based on 25 matches and 60 non-matches
  Classified 123 matches and 583 non-matches
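The split above trains a classifier on the oracle-labelled sample and partitions the remaining weight vectors into two child clusters. A rough sketch using scikit-learn's `SVC` as a stand-in (the original program's SVM implementation is not shown here):

```python
from sklearn.svm import SVC

def svm_split(train_vecs, train_labels, remaining_vecs):
    """Train a linear SVM on the oracle-classified sample and split the
    remaining weight vectors into match and non-match child clusters."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    predictions = clf.predict(remaining_vecs)
    matches = [v for v, p in zip(remaining_vecs, predictions) if p]
    non_matches = [v for v, p in zip(remaining_vecs, predictions) if not p]
    return matches, non_matches

# toy example: matches near (1, 1), non-matches near (0, 0)
train = [[0.9, 0.8], [0.95, 0.9], [0.1, 0.2], [0.05, 0.1]]
labels = [1, 1, 0, 0]
m, n = svm_split(train, labels, [[0.85, 0.9], [0.15, 0.1]])
```

Both children then re-enter the queue with purity, entropy, and match-proportion estimates inherited from the oracle sample, as the Loop 2 listing below shows.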

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)
    (583, 0.7058823529411765, 0.8739810481273578, 0.29411764705882354)

Current size of match and non-match training data sets: 25 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.71 and entropy 0.87
- Size 583 weight vectors
- Estimated match proportion 0.294

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 583 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.423, 0.478, 0.500, 0.813, 0.545] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.833, 1.000, 0.077, 0.067, 0.133, 0.214, 0.000] (True)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.667, 0.429, 0.789, 0.444, 0.462] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 18 matches and 52 non-matches
    Purity of oracle classification:  0.743
    Entropy of oracle classification: 0.822
    Number of true matches:      18
    Number of false matches:     0
    Number of true non-matches:  52
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)912_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (10, 1 - acm diverg, 912), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)912_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 771
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 771 weight vectors
  Containing 203 true matches and 568 true non-matches
    (26.33% true matches)
  Identified 721 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   687  (95.28%)
          2 :    31  (4.30%)
          3 :     2  (0.28%)
         16 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 721 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 173
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 547

Removed 1 non-pure weight vector

Final number of weight vectors to use: 770
  Number of unique weight vectors: 721

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (721, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 721 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 721 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 30 matches and 54 non-matches
    Purity of oracle classification:  0.643
    Entropy of oracle classification: 0.940
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 637 weight vectors
  Based on 30 matches and 54 non-matches
  Classified 140 matches and 497 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (140, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)
    (497, 0.6428571428571429, 0.9402859586706309, 0.35714285714285715)

Current size of match and non-match training data sets: 30 / 54

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 497 weight vectors
- Estimated match proportion 0.357

Sample size for this cluster: 75

Farthest first selection of 75 weight vectors from 497 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [1.000, 0.000, 0.375, 0.409, 0.400, 0.333, 0.611] (False)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.778, 0.875, 0.333, 0.900, 0.444] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [1.000, 0.000, 0.917, 0.786, 0.263, 0.500, 0.556] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.808, 0.435, 0.700, 0.538, 0.688] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [1.000, 0.000, 0.767, 0.667, 0.545, 0.786, 0.773] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.391, 0.500, 0.600, 0.529, 0.381] (False)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.667, 0.500, 0.455, 0.259, 0.250] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.857, 0.444, 0.556, 0.235, 0.500] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 75 weight vectors
  The oracle will correctly classify 75 weight vectors and wrongly classify 0
  Classified 4 matches and 71 non-matches
    Purity of oracle classification:  0.947
    Entropy of oracle classification: 0.300
    Number of true matches:      4
    Number of false matches:     0
    Number of true non-matches:  71
    Number of false non-matches: 0

Deleted 75 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing file: diverg(10)108_NEW.csv
<class 'pandas.core.series.Series'>
Current row right here!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (10, 1 - acm diverg, 108), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)108_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 829
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 829 weight vectors
  Containing 227 true matches and 602 true non-matches
    (27.38% true matches)
  Identified 772 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   735  (95.21%)
          2 :    34  (4.40%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 772 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 581

Removed 1 non-pure weight vector

Final number of weight vectors to use: 828
  Number of unique weight vectors: 772

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (772, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 772 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 772 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
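The "far" initial selection above is a farthest-first traversal: starting from one weight vector, it repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the whole cluster. A minimal sketch, assuming Euclidean distance and a fixed starting index (the actual program may choose the start differently):

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    # Greedy farthest-first traversal: repeatedly pick the vector whose
    # minimum distance to the already-selected set is largest.
    vectors = np.asarray(vectors, dtype=float)
    selected = [start]
    # Minimum distance from every vector to the selected set so far
    min_dist = np.linalg.norm(vectors - vectors[start], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        # Update each vector's distance to its nearest selected vector
        dist_new = np.linalg.norm(vectors - vectors[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_new)
    return selected

# Tiny 1-D illustration: the extremes are picked first
print(farthest_first([[0.0], [1.0], [9.0], [10.0]], 3))  # [0, 3, 1]
```

Each step costs O(n) distance updates, so selecting k of n vectors is O(nk); this matches the pattern above where 85 widely scattered vectors are drawn from a cluster of 772.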

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 687 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 150 matches and 537 non-matches
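The SVM step above trains on the oracle-labelled sample (28 matches, 57 non-matches) and predicts the remaining 687 unlabelled weight vectors of the cluster, splitting it into a predicted-match and a predicted-non-match sub-cluster. A sketch of that split using scikit-learn's SVC; the counts come from the log, but the feature values below are random stand-ins, not the real weight vectors:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)
# Stand-in data for the oracle-labelled sample (7 similarity weights each)
train_match    = rng.uniform(0.6, 1.0, size=(28, 7))  # oracle-confirmed matches
train_nonmatch = rng.uniform(0.0, 0.5, size=(57, 7))  # oracle-confirmed non-matches
X_train = np.vstack([train_match, train_nonmatch])
y_train = np.array([1] * 28 + [0] * 57)

# Train on the labelled sample, then split the rest of the cluster in two
clf = SVC(kernel="linear")
clf.fit(X_train, y_train)

remaining = rng.uniform(0.0, 1.0, size=(687, 7))  # unlabelled weight vectors
pred = clf.predict(remaining)
match_cluster    = remaining[pred == 1]
nonmatch_cluster = remaining[pred == 0]
```

Both sub-clusters inherit the parent's purity/entropy estimates until they are themselves sampled, which is why the two queue entries in Loop 2 carry identical statistics.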

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (537, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster with (queue ordering: random):
- Purity 0.67 and entropy 0.91
- Size 150 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 51 matches and 3 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.310
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)769_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (10, 1 - acm diverg, 769), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)769_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 274
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 274 weight vectors
  Containing 200 true matches and 74 true non-matches
    (72.99% true matches)
  Identified 241 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   227  (94.19%)
          2 :    11  (4.56%)
          3 :     2  (0.83%)
         19 :     1  (0.41%)

Identified 1 non-pure unique weight vector (from 241 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 167
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 73

Removed 1 non-pure weight vector

Final number of weight vectors to use: 273
  Number of unique weight vectors: 241

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (241, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 241 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 69

Perform initial selection using "far" method

Farthest first selection of 69 weight vectors from 241 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.909, 0.786, 0.417, 0.222, 0.563] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.438, 0.571, 0.444, 0.533, 0.611] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.467, 1.000, 0.231, 0.304, 0.250, 0.115, 0.000] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.857, 0.000, 0.688, 0.500, 0.412, 0.409, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 37 matches and 32 non-matches
    Purity of oracle classification:  0.536
    Entropy of oracle classification: 0.996
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  32
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 172 weight vectors
  Based on 37 matches and 32 non-matches
  Classified 136 matches and 36 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 69
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.5362318840579711, 0.9962088839046743, 0.5362318840579711)
    (36, 0.5362318840579711, 0.9962088839046743, 0.5362318840579711)

Current size of match and non-match training data sets: 37 / 32

Selected cluster with (queue ordering: random):
- Purity 0.54 and entropy 1.00
- Size 136 weight vectors
- Estimated match proportion 0.536

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 136 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.875, 0.778, 0.829, 0.917, 0.826] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 51 matches and 5 non-matches
    Purity of oracle classification:  0.911
    Entropy of oracle classification: 0.434
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)77_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 77), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)77_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1084
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1084 weight vectors
  Containing 227 true matches and 857 true non-matches
    (20.94% true matches)
  Identified 1027 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   990  (96.40%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1027 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1083
  Number of unique weight vectors: 1027

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1027, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1027 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1027 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 30 matches and 58 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 939 weight vectors
  Based on 30 matches and 58 non-matches
  Classified 179 matches and 760 non-matches
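
The SVM step above trains on the oracle-labelled weight vectors and splits the remaining unlabelled vectors into a predicted-match and a predicted-non-match cluster. A minimal sketch of this step, assuming scikit-learn is available (the original program's SVM implementation and parameters may differ, and `svm_split` is an illustrative name):

```python
# Sketch of the SVM cluster-splitting step; assumes scikit-learn.
from sklearn import svm

def svm_split(train_vectors, train_labels, remaining_vectors):
    """Train an SVM on oracle-labelled weight vectors (label 1 = match,
    0 = non-match) and split the remaining vectors by predicted class."""
    clf = svm.SVC(kernel='linear')
    clf.fit(train_vectors, train_labels)
    pred = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, pred) if p == 1]
    non_matches = [v for v, p in zip(remaining_vectors, pred) if p == 0]
    return matches, non_matches
```

Both resulting clusters are then pushed back onto the queue, as the Loop 2 output below shows.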

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (179, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)
    (760, 0.6590909090909091, 0.9256859869821299, 0.3409090909090909)

Current size of match and non-match training data sets: 30 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 179 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 179 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
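
Farthest-first selection, used above to pick the sample of weight vectors to hand to the oracle, greedily adds at each step the vector whose minimum distance to the already selected vectors is largest. A sketch under the assumption of Euclidean distance and a fixed starting vector (the original program's metric and seeding may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from the first vector, then
    repeatedly add the candidate farthest from the current selection.
    Euclidean distance is an assumption, not necessarily the original's."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    candidates = vectors[1:]
    while len(selected) < k and candidates:
        best = max(candidates,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        candidates.remove(best)
    return selected
```

This spreads the oracle budget over the weight-vector space rather than concentrating it in one dense region.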

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 43 matches and 15 non-matches
    Purity of oracle classification:  0.741
    Entropy of oracle classification: 0.825
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  15
    Number of false non-matches: 0
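
The purity and entropy reported above are the majority-class fraction and the two-class entropy of the oracle-classified sample. A small sketch of how these two numbers can be computed (`cluster_stats` is an illustrative name):

```python
import math

def cluster_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction) and binary entropy of a set of
    classified weight vectors."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:
            entropy -= q * math.log(q, 2)
    return purity, entropy

# For the 43 matches / 15 non-matches above: purity ≈ 0.741, entropy ≈ 0.825
```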

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(20)371_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 371), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)371_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1093
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1093 weight vectors
  Containing 226 true matches and 867 true non-matches
    (20.68% true matches)
  Identified 1036 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   999  (96.43%)
          2 :    34  (3.28%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1036 unique weight vectors)
Pureness (as fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 846

Removed 1 non-pure weight vector
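
A unique weight vector is non-pure when identical vectors were generated by both a true match and a true non-match; the minority-class copies are then removed so every remaining unique vector has a single label. A sketch of this pureness check, with illustrative names and boolean labels assumed (the tie-break at pureness 0.5 is a choice of this sketch, not necessarily the original's):

```python
from collections import defaultdict

def remove_minority_copies(weight_vectors):
    """Group identical weight vectors, compute pureness as the fraction of
    true-match labels in each group, and drop the minority-class copies of
    any group that is not fully pure."""
    groups = defaultdict(list)
    for vec, is_match in weight_vectors:
        groups[tuple(vec)].append(is_match)
    kept = []
    for vec, labels in groups.items():
        pureness = sum(labels) / len(labels)
        majority = pureness >= 0.5  # majority class of this group
        for is_match in labels:
            if pureness in (0.0, 1.0) or is_match == majority:
                kept.append((list(vec), is_match))
    return kept
```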

Final number of weight vectors to use: 1092
  Number of unique weight vectors: 1036

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1036, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1036 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1036 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 25 matches and 63 non-matches
    Purity of oracle classification:  0.716
    Entropy of oracle classification: 0.861
    Number of true matches:      25
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 948 weight vectors
  Based on 25 matches and 63 non-matches
  Classified 131 matches and 817 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)
    (817, 0.7159090909090909, 0.8609652558547649, 0.2840909090909091)

Current size of match and non-match training data sets: 25 / 63

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 817 weight vectors
- Estimated match proportion 0.284

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 817 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.760, 0.917, 0.500, 0.786, 0.500] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.583, 0.444, 0.818, 0.706, 0.857] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 11 matches and 60 non-matches
    Purity of oracle classification:  0.845
    Entropy of oracle classification: 0.622
    Number of true matches:      11
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)993_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (10, 1 - acm diverg, 993), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)993_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 842
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 842 weight vectors
  Containing 209 true matches and 633 true non-matches
    (24.82% true matches)
  Identified 795 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   760  (95.60%)
          2 :    32  (4.03%)
          3 :     2  (0.25%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 795 unique weight vectors)
Pureness (as fraction of matches) of each unique weight vector:
  Pureness : Count
     1.000 : 182
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 612

Removed 1 non-pure weight vector

Final number of weight vectors to use: 841
  Number of unique weight vectors: 795

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (795, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 795 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 795 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.722, 0.471, 0.545, 0.579] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.556, 0.182, 0.500, 0.071, 0.400] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.478, 0.714, 0.700, 0.824, 0.286] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.344, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.300, 0.524, 0.727, 0.762] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 24 matches and 61 non-matches
    Purity of oracle classification:  0.718
    Entropy of oracle classification: 0.859
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 710 weight vectors
  Based on 24 matches and 61 non-matches
  Classified 149 matches and 561 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (149, 0.7176470588235294, 0.8586370819183629, 0.2823529411764706)
    (561, 0.7176470588235294, 0.8586370819183629, 0.2823529411764706)

Current size of match and non-match training data sets: 24 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 149 weight vectors
- Estimated match proportion 0.282

Sample size for this cluster: 51

Farthest first selection of 51 weight vectors from 149 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.600, 1.000, 1.000, 1.000, 1.000, 0.952, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 51 weight vectors
  The oracle will correctly classify 51 weight vectors and wrongly classify 0
  Classified 45 matches and 6 non-matches
    Purity of oracle classification:  0.882
    Entropy of oracle classification: 0.523
    Number of true matches:      45
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 51 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(15)519_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 519), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)519_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 597
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 597 weight vectors
  Containing 214 true matches and 383 true non-matches
    (35.85% true matches)
  Identified 563 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   548  (97.34%)
          2 :    12  (2.13%)
          3 :     2  (0.36%)
         19 :     1  (0.18%)
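Such an occurrence distribution can be produced by counting identical weight vectors and then counting the counts. A sketch using `collections.Counter` (the helper name `occurrence_distribution` is ours; vectors are made hashable as tuples):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Map occurrence count -> number of unique weight vectors with that count."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)  # vector -> frequency
    return Counter(vec_counts.values())                     # frequency -> how many

vecs = [[0.5, 1.0]] * 3 + [[0.2, 0.8]] * 2 + [[1.0, 1.0]]
print(sorted(occurrence_distribution(vecs).items()))  # [(1, 1), (2, 1), (3, 1)]
```

The totals in the block above check out: 548·1 + 12·2 + 2·3 + 1·19 = 597 weight vectors over 563 unique ones.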

Identified 1 non-pure unique weight vector (from 563 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 180
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 382

Removed 1 non-pure weight vector

Final number of weight vectors to use: 596
  Number of unique weight vectors: 563

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (563, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 563 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 563 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
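The "farthest first" selection above is a greedy traversal: each new weight vector is the one whose minimum Euclidean distance to the already-selected set is largest, so the sample spreads across the cluster. A minimal sketch (the function name, seed choice, tie-breaking, and distance metric are assumptions; the actual program may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum distance to the already-selected set is largest."""
    selected = [vectors[0]]                        # assumed seed: first vector
    # running minimum distance from every candidate to the selected set
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < min(k, len(vectors)):
        i = max(range(len(vectors)), key=lambda j: min_dist[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_dist[j] = min(min_dist[j], math.dist(v, vectors[i]))
    return selected
```

Keeping the per-candidate minimum distances makes each round O(n) instead of recomputing all pairwise distances, which matters when sampling 80+ vectors from clusters of several hundred.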

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 27 matches and 55 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 481 weight vectors
  Based on 27 matches and 55 non-matches
  Classified 142 matches and 339 non-matches
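The SVM step trains on the oracle-labeled sample (27 matches, 55 non-matches here) and partitions the remaining 481 vectors into two child clusters. As a stand-in that illustrates the same split without an SVM library, here is a nearest-centroid version (explicitly not the program's actual classifier):

```python
import math

def split_cluster(labeled, unlabeled):
    """Split `unlabeled` vectors into (matches, non_matches) by nearest
    class centroid of the oracle-labeled sample. A simplified stand-in
    for the SVM split used in the run above."""
    def centroid(vecs):
        return [sum(c) / len(vecs) for c in zip(*vecs)]

    m_cent = centroid([v for v, is_match in labeled if is_match])
    n_cent = centroid([v for v, is_match in labeled if not is_match])
    matches, non_matches = [], []
    for v in unlabeled:
        (matches if math.dist(v, m_cent) <= math.dist(v, n_cent)
         else non_matches).append(v)
    return matches, non_matches
```

Either way, the two predicted subclusters are then pushed back onto the queue, which is why the next loop shows a queue length of 2.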

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (142, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)
    (339, 0.6707317073170732, 0.9141770436147918, 0.32926829268292684)

Current size of match and non-match training data sets: 27 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 142 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 142 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 49 matches and 4 non-matches
    Purity of oracle classification:  0.925
    Entropy of oracle classification: 0.386
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  4
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget
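Each "Loop N" block is one pass of the budgeted recursive procedure: pop a cluster from the queue (random ordering), sample from it, let the oracle label the sample, move the labels into the training sets, and, if the remainder is still impure or too large, split it and enqueue the halves, until the manual-classification budget is exhausted. A compact sketch of that control flow (the oracle, sampling, and split policies below are simplified stand-ins, not the program's):

```python
import random

def recursive_selection(vectors, budget, sample_size=3, min_purity=0.95,
                        max_cluster_size=10, seed=0):
    """Budgeted recursive training-example selection (simplified sketch).
    `vectors` is a list of (weight_vector, true_match) pairs; the 'oracle'
    here simply reveals the true flag."""
    rng = random.Random(seed)
    queue = [list(vectors)]
    train_m, train_n = [], []          # match / non-match training sets
    used = 0                           # manual classifications performed
    while queue and used < budget:
        cluster = queue.pop(rng.randrange(len(queue)))   # random queue ordering
        if not cluster:
            continue
        k = min(sample_size, len(cluster), budget - used)
        sample, rest = cluster[:k], cluster[k:]          # stand-in sampler
        used += k
        for vec, is_match in sample:                     # oracle labels the sample
            (train_m if is_match else train_n).append(vec)
        n_m = sum(1 for _, is_match in sample if is_match)
        purity = max(n_m, k - n_m) / k
        if rest and (purity < min_purity or len(rest) > max_cluster_size):
            # stand-in split: threshold on the first weight component
            left = [p for p in rest if p[0][0] >= 0.5]
            right = [p for p in rest if p[0][0] < 0.5]
            queue += [c for c in (left, right) if c]
    return train_m, train_n, used
```

The messages above correspond to the loop's exit conditions: "Reached end of manual classification budget" fires when `used` hits the budget while impure clusters are still queued.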

40.0
Analysing file: diverg(15)68_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 68), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)68_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 824
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 824 weight vectors
  Containing 221 true matches and 603 true non-matches
    (26.82% true matches)
  Identified 768 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   732  (95.31%)
          2 :    33  (4.30%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 768 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 582

Removed 1 non-pure weight vector

Final number of weight vectors to use: 823
  Number of unique weight vectors: 768

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (768, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 768 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 768 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.556, 0.182, 0.500, 0.071, 0.400] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.407, 0.818, 0.625, 0.400, 0.889] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 0.963, 1.000, 1.000] (True)
    [1.000, 0.000, 0.037, 0.450, 0.727, 0.400, 0.429] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [0.344, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.033, 0.300, 0.524, 0.727, 0.762] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.333, 0.667, 0.750, 0.909, 0.842] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 24 matches and 61 non-matches
    Purity of oracle classification:  0.718
    Entropy of oracle classification: 0.859
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  61
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 683 weight vectors
  Based on 24 matches and 61 non-matches
  Classified 17 matches and 666 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (17, 0.7176470588235294, 0.8586370819183629, 0.2823529411764706)
    (666, 0.7176470588235294, 0.8586370819183629, 0.2823529411764706)

Current size of match and non-match training data sets: 24 / 61

Selected cluster (queue ordering: random) with:
- Purity 0.72 and entropy 0.86
- Size 666 weight vectors
- Estimated match proportion 0.282

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 666 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.783, 0.357, 0.750, 0.412, 0.238] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.833, 0.500, 0.368, 0.235, 0.429] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.714, 0.545, 0.471, 0.476] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 24 matches and 46 non-matches
    Purity of oracle classification:  0.657
    Entropy of oracle classification: 0.928
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  46
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(10)148_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.984127
recall                 0.207358
f-measure              0.342541
da                           63
dm                            0
ndm                           0
tp                           62
fp                            1
tn                  4.76529e+07
fn                          237
Name: (10, 1 - acm diverg, 148), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)148_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 430
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 430 weight vectors
  Containing 198 true matches and 232 true non-matches
    (46.05% true matches)
  Identified 398 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   382  (95.98%)
          2 :    13  (3.27%)
          3 :     2  (0.50%)
         16 :     1  (0.25%)

Identified 1 non-pure unique weight vector (from 398 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 168
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 229

Removed 1 non-pure weight vector

Final number of weight vectors to use: 429
  Number of unique weight vectors: 398

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (398, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 398 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 77

Perform initial selection using "far" method

Farthest first selection of 77 weight vectors from 398 vectors
  The selected farthest weight vectors are:
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.750, 1.000, 0.189, 0.324, 0.147, 0.200, 0.226] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.833, 0.550, 0.500, 0.688] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.727, 0.556, 0.818, 0.778] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.304, 0.571, 0.556, 0.588, 0.762] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.622, 1.000, 0.243, 0.000, 0.042, 0.156, 0.120] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 37 matches and 40 non-matches
    Purity of oracle classification:  0.519
    Entropy of oracle classification: 0.999
    Number of true matches:      37
    Number of false matches:     0
    Number of true non-matches:  40
    Number of false non-matches: 0
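The purity, entropy, and estimated match proportion figures reported above can be reproduced from the oracle's match / non-match counts. A minimal sketch (function names are illustrative, not taken from the original program):

```python
import math

def purity(n_match, n_non_match):
    # Fraction of the majority class among the classified vectors
    return max(n_match, n_non_match) / (n_match + n_non_match)

def entropy(n_match, n_non_match):
    # Binary Shannon entropy of the match/non-match distribution
    p = n_match / (n_match + n_non_match)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def match_proportion(n_match, n_non_match):
    # Used as the estimated match proportion of the sampled cluster
    return n_match / (n_match + n_non_match)

# 37 matches and 40 non-matches, as classified above:
print(round(purity(37, 40), 3))            # 0.519
print(round(entropy(37, 40), 3))           # 0.999
print(round(match_proportion(37, 40), 3))  # 0.481
```

These values match the 0.519 / 0.999 / 0.481 reported for this cluster.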

Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 321 weight vectors
  Based on 37 matches and 40 non-matches
  Classified 261 matches and 60 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 77
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (261, 0.5194805194805194, 0.9989047442823606, 0.4805194805194805)
    (60, 0.5194805194805194, 0.9989047442823606, 0.4805194805194805)

Current size of match and non-match training data sets: 37 / 40

Selected cluster with (queue ordering: random):
- Purity 0.52 and entropy 1.00
- Size 261 weight vectors
- Estimated match proportion 0.481

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 261 vectors
  The selected farthest weight vectors are:
    [0.581, 1.000, 0.091, 0.213, 0.138, 0.206, 0.083] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.758, 1.000, 0.250, 0.056, 0.034, 0.154, 0.103] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.091, 0.148] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.750, 1.000, 0.333, 0.216, 0.139, 0.182, 0.179] (False)
    [0.500, 1.000, 0.244, 0.171, 0.150, 0.194, 0.250] (False)
    [0.804, 1.000, 0.091, 0.175, 0.074, 0.069, 0.111] (False)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.660, 1.000, 0.222, 0.176, 0.174, 0.077, 0.000] (False)
    [0.954, 1.000, 0.250, 0.154, 0.233, 0.364, 0.190] (False)
    [0.875, 1.000, 0.250, 0.333, 0.214, 0.122, 0.111] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.667, 1.000, 0.933, 1.000, 0.947, 1.000, 0.947] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.603, 1.000, 0.269, 0.169, 0.231, 0.258, 0.116] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.750, 1.000, 0.257, 0.184, 0.286, 0.216, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.857, 0.944, 0.214, 0.118, 0.111, 0.125, 0.000] (False)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 1.000, 0.233, 0.293, 0.256, 0.175, 0.327] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.750, 1.000, 0.256, 0.080, 0.286, 0.059, 0.229] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.907, 1.000, 0.667, 0.118, 0.091, 0.063, 0.188] (True)
    [0.600, 0.944, 0.226, 0.174, 0.000, 0.174, 0.059] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.800, 0.944, 0.308, 0.125, 0.040, 0.000, 0.071] (False)
    [0.913, 1.000, 0.184, 0.175, 0.087, 0.233, 0.167] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
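The "farthest first" selection above can be sketched as a greedy farthest-first traversal: repeatedly pick the weight vector whose minimum Euclidean distance to the already selected set is largest. This is only a sketch of the idea; the original program's seeding rule and distance measure may differ:

```python
import math

def farthest_first(vectors, k):
    # Greedy farthest-first traversal over a list of equal-length tuples
    selected = [vectors[0]]  # assumption: seed with the first vector
    while len(selected) < k:
        best = max(
            (v for v in vectors if v not in selected),
            # a candidate's score is its distance to the NEAREST selected vector
            key=lambda v: min(math.dist(v, s) for s in selected),
        )
        selected.append(best)
    return selected
```

For example, `farthest_first([(0.0,), (0.1,), (1.0,), (0.5,)], 3)` picks `(0.0,)`, then the far end `(1.0,)`, then the midpoint `(0.5,)`, spreading the sample across the space.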

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 47 matches and 23 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.913
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  23
    Number of false non-matches: 0

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

63.0
Analysing file: diverg(15)894_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981132
recall                 0.173913
f-measure              0.295455
da                           53
dm                            0
ndm                           0
tp                           52
fp                            1
tn                  4.76529e+07
fn                          247
Name: (15, 1 - acm diverg, 894), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)894_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 782
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 782 weight vectors
  Containing 213 true matches and 569 true non-matches
    (27.24% true matches)
  Identified 730 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   695  (95.21%)
          2 :    32  (4.38%)
          3 :     2  (0.27%)
         17 :     1  (0.14%)

Identified 1 non-pure unique weight vector (from 730 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 548
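The frequency and pureness analysis above can be sketched by grouping identical weight vectors and computing, per unique vector, the proportion of its occurrences that are true matches. Names here are illustrative, not from the original program:

```python
from collections import Counter

def analyse_unique_vectors(vectors, true_match):
    """Group identical weight vectors; report the occurrence frequency
    distribution and a per-unique-vector 'pureness' (sketch only)."""
    groups = {}
    for vec, is_match in zip(vectors, true_match):
        groups.setdefault(tuple(vec), []).append(is_match)
    # How many unique vectors occur n times
    freq = Counter(len(labels) for labels in groups.values())
    # Pureness per unique vector: 1.0 = always a match, 0.0 = never a match
    pureness = {vec: sum(labels) / len(labels) for vec, labels in groups.items()}
    return freq, pureness
```

Unique vectors with a pureness strictly between 0 and 1 are the "non-pure" ones; as the log shows, their minority-class occurrences are removed before training-example selection.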

Removed 1 non-pure weight vector

Final number of weight vectors to use: 781
  Number of unique weight vectors: 730

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (730, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 730 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 730 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.391, 1.000, 0.130, 0.150, 0.200, 0.150, 0.074] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 30 matches and 55 non-matches
    Purity of oracle classification:  0.647
    Entropy of oracle classification: 0.937
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 645 weight vectors
  Based on 30 matches and 55 non-matches
  Classified 148 matches and 497 non-matches
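The SVM step above trains a classifier on the oracle-labelled sample and uses it to split the remaining vectors of the cluster into a predicted-match and a predicted-non-match child cluster (the two queue entries in the next loop). A sketch using scikit-learn's `SVC`; the original program's kernel and parameters are not shown in this log, so the defaults here are an assumption:

```python
from sklearn.svm import SVC

def svm_split(labelled, labels, unlabelled):
    # Train on the oracle-classified weight vectors (1 = match, 0 = non-match)
    clf = SVC()  # default RBF kernel -- an assumption
    clf.fit(labelled, labels)
    # Predict the remaining vectors and form the two child clusters
    pred = clf.predict(unlabelled)
    matches = [v for v, p in zip(unlabelled, pred) if p == 1]
    non_matches = [v for v, p in zip(unlabelled, pred) if p == 0]
    return matches, non_matches
```

Both child clusters initially inherit the purity, entropy, and estimated match proportion of the oracle-classified sample, which is why the two queue entries in Loop 2 carry identical statistics.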

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)
    (497, 0.6470588235294118, 0.9366673818775626, 0.35294117647058826)

Current size of match and non-match training data sets: 30 / 55

Selected cluster with (queue ordering: random):
- Purity 0.65 and entropy 0.94
- Size 148 weight vectors
- Estimated match proportion 0.353

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 48 matches and 7 non-matches
    Purity of oracle classification:  0.873
    Entropy of oracle classification: 0.550
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

53.0
Analysing file: diverg(15)684_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 684), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)684_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1050
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1050 weight vectors
  Containing 208 true matches and 842 true non-matches
    (19.81% true matches)
  Identified 1003 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   968  (96.51%)
          2 :    32  (3.19%)
          3 :     2  (0.20%)
         12 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1003 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 181
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 821

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1049
  Number of unique weight vectors: 1003

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1003, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1003 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1003 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 24 matches and 63 non-matches
    Purity of oracle classification:  0.724
    Entropy of oracle classification: 0.850
    Number of true matches:      24
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 916 weight vectors
  Based on 24 matches and 63 non-matches
  Classified 123 matches and 793 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (123, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)
    (793, 0.7241379310344828, 0.8497511372532974, 0.27586206896551724)

Current size of match and non-match training data sets: 24 / 63

Selected cluster with (queue ordering: random):
- Purity 0.72 and entropy 0.85
- Size 793 weight vectors
- Estimated match proportion 0.276

Sample size for this cluster: 70

Farthest first selection of 70 weight vectors from 793 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)

Perform oracle with 100.00% accuracy on 70 weight vectors
  The oracle will correctly classify 70 weight vectors and wrongly classify 0
  Classified 12 matches and 58 non-matches
    Purity of oracle classification:  0.829
    Entropy of oracle classification: 0.661
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0
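
The purity and entropy values reported by the oracle step can be reproduced from the match/non-match counts. A minimal sketch (the function name is illustrative, not taken from the script):

```python
import math

def purity_entropy(num_match, num_nonmatch):
    """Purity is the fraction of the majority class; entropy is the
    binary Shannon entropy of the match/non-match split."""
    total = num_match + num_nonmatch
    p_match = num_match / total
    purity = max(p_match, 1.0 - p_match)
    entropy = 0.0
    for p in (p_match, 1.0 - p_match):
        if p > 0.0:
            entropy -= p * math.log(p, 2)
    return purity, entropy

# The 12 matches / 58 non-matches above give purity 0.829 and entropy 0.661.
print(purity_entropy(12, 58))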

Deleted 70 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)436_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990566
recall                 0.351171
f-measure              0.518519
da                          106
dm                            0
ndm                           0
tp                          105
fp                            1
tn                  4.76529e+07
fn                          194
Name: (10, 1 - acm diverg, 436), dtype: object
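
The precision, recall and f-measure fields in the row above follow directly from the raw tp/fp/fn counts; a minimal sketch, assuming the standard definitions:

```python
def prf(tp, fp, fn):
    """Precision, recall and F-measure from raw classification counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# tp=105, fp=1, fn=194 reproduce the row's 0.990566 / 0.351171 / 0.518519.
p, r, f = prf(tp=105, fp=1, fn=194)
print(round(p, 6), round(r, 6), round(f, 6))
```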

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)436_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 496
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 496 weight vectors
  Containing 137 true matches and 359 true non-matches
    (27.62% true matches)
  Identified 482 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   476  (98.76%)
          2 :     3  (0.62%)
          3 :     2  (0.41%)
          8 :     1  (0.21%)
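
The occurrence frequency distribution above can be built with two nested counts; a minimal sketch on toy vectors (the real script works on the loaded weight vectors):

```python
from collections import Counter

# Toy weight vectors; tuples so they are hashable.
vecs = [(1.0, 0.0), (1.0, 0.0), (0.5, 0.5), (1.0, 1.0), (1.0, 0.0)]

occ = Counter(vecs)           # how often each unique vector occurs
dist = Counter(occ.values())  # occurrence -> number of unique vectors
for count in sorted(dist):
    num = dist[count]
    print('%10d : %5d  (%.2f%%)' % (count, num, 100.0 * num / len(occ)))
```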

Identified 1 non-pure unique weight vector (from 482 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 123
     0.875 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 358

Removed 8 non-pure weight vectors

Final number of weight vectors to use: 488
  Number of unique weight vectors: 481

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (481, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 481 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using "far" method

Farthest first selection of 80 weight vectors from 481 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.615, 0.714, 0.353, 0.583, 0.571] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.667, 0.000, 0.850, 0.733, 0.652, 0.778, 0.474] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
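
The "far" initial selection reported above is a farthest-first traversal: each new vector maximises the distance to its nearest already-selected vector. A minimal pure-Python sketch (the seed choice and Euclidean metric are assumptions, not taken from the script):

```python
import math

def farthest_first(vectors, k):
    """Farthest-first traversal: start from the first vector, then
    repeatedly add the vector whose distance to its nearest selected
    vector is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    while len(selected) < min(k, len(vectors)):
        best = max((v for v in vectors if v not in selected),
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected

pts = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.5, 0.5)]
print(farthest_first(pts, 3))  # -> [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```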

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 29 matches and 51 non-matches
    Purity of oracle classification:  0.637
    Entropy of oracle classification: 0.945
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  51
    Number of false non-matches: 0

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 401 weight vectors
  Based on 29 matches and 51 non-matches
  Classified 99 matches and 302 non-matches
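
The SVM step above uses the oracle-labelled vectors as training data to split the remaining unlabelled vectors into a predicted-match and a predicted-non-match cluster. The sketch below substitutes a nearest-centroid classifier as a dependency-free stand-in for the SVM; all data and names are illustrative:

```python
def nearest_centroid_split(train_match, train_nonmatch, rest):
    """Split the remaining cluster vectors by whichever training-class
    centroid each one lies closer to (nearest-centroid stand-in for
    the SVM split performed by the script)."""
    def centroid(vecs):
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    cm, cn = centroid(train_match), centroid(train_nonmatch)
    matches = [v for v in rest if sqdist(v, cm) <= sqdist(v, cn)]
    non_matches = [v for v in rest if sqdist(v, cm) > sqdist(v, cn)]
    return matches, non_matches

train_match = [[0.9, 0.8], [0.8, 0.9]]     # oracle-labelled matches
train_nonmatch = [[0.1, 0.2], [0.2, 0.1]]  # oracle-labelled non-matches
rest = [[0.85, 0.9], [0.15, 0.1], [0.7, 0.75]]
m, n = nearest_centroid_split(train_match, train_nonmatch, rest)
print(len(m), len(n))  # 2 1
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows from 1 to 2 in the next loop.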

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (99, 0.6375, 0.944738828646789, 0.3625)
    (302, 0.6375, 0.944738828646789, 0.3625)

Current size of match and non-match training data sets: 29 / 51

Selected cluster with (queue ordering: random):
- Purity 0.64 and entropy 0.94
- Size 99 weight vectors
- Estimated match proportion 0.362

Sample size for this cluster: 47

Farthest first selection of 47 weight vectors from 99 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 47 weight vectors
  The oracle will correctly classify 47 weight vectors and wrongly classify 0
  Classified 41 matches and 6 non-matches
    Purity of oracle classification:  0.872
    Entropy of oracle classification: 0.551
    Number of true matches:      41
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 47 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

106.0
Analysing file: diverg(20)858_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (20, 1 - acm diverg, 858), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)858_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 732
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 732 weight vectors
  Containing 219 true matches and 513 true non-matches
    (29.92% true matches)
  Identified 677 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   641  (94.68%)
          2 :    33  (4.87%)
          3 :     2  (0.30%)
         19 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 677 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 184
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 492

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 731
  Number of unique weight vectors: 677

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (677, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 677 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 677 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 593 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 148 matches and 445 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (445, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster with (queue ordering: random):
- Purity 0.68 and entropy 0.91
- Size 148 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 54

Farthest first selection of 54 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.903, 0.903, 0.903, 0.903, 0.903] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 1.000, 1.000, 0.875, 0.769, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 0.938, 1.000, 0.900, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)

Perform oracle with 100.00% accuracy on 54 weight vectors
  The oracle will correctly classify 54 weight vectors and wrongly classify 0
  Classified 51 matches and 3 non-matches
    Purity of oracle classification:  0.944
    Entropy of oracle classification: 0.310
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 54 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analysing file: diverg(20)633_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 633), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)633_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1083
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1083 weight vectors
  Containing 226 true matches and 857 true non-matches
    (20.87% true matches)
  Identified 1026 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   989  (96.39%)
          2 :    34  (3.31%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1026 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 836

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1082
  Number of unique weight vectors: 1026

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1026, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1026 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1026 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.800, 1.000, 0.261, 0.158, 0.250, 0.038, 0.282] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
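The "far" method above is a greedy farthest-first traversal: starting from one vector, each step adds the vector whose minimum Euclidean distance to the already selected set is largest. A minimal sketch, assuming list-of-tuples input (`farthest_first` is a hypothetical helper, not the program's own function):

```python
import math

def farthest_first(vectors, k):
    # Hypothetical sketch of farthest-first selection: start from an
    # arbitrary vector, then repeatedly pick the candidate whose minimum
    # distance to the already selected set is largest.
    selected = [vectors[0]]
    # min_dist[i] = distance from vectors[i] to its nearest selected vector
    min_dist = [math.dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        idx = max(range(len(vectors)), key=lambda i: min_dist[i])
        selected.append(vectors[idx])
        # A newly selected vector may become the nearest one for others
        for i, v in enumerate(vectors):
            min_dist[i] = min(min_dist[i], math.dist(v, vectors[idx]))
    return selected
```

Because each pick maximises the distance to the sample chosen so far, the selection spreads across the whole cluster rather than concentrating in one region, which is what the diverse list of vectors above shows.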

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 28 matches and 60 non-matches
    Purity of oracle classification:  0.682
    Entropy of oracle classification: 0.902
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  60
    Number of false non-matches: 0
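The purity and entropy figures reported above follow directly from the match / non-match counts: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the split. A minimal sketch (hypothetical helper name):

```python
import math

def purity_entropy(num_matches, num_non_matches):
    # Purity: fraction of the cluster belonging to the majority class.
    # Entropy: binary Shannon entropy of the match / non-match split.
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # 0 * log2(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the 28 matches and 60 non-matches above this gives purity 60/88 ≈ 0.682 and entropy ≈ 0.902, matching the log.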

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 938 weight vectors
  Based on 28 matches and 60 non-matches
  Classified 159 matches and 779 non-matches
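The split step trains a classifier on the oracle-labelled vectors and uses its predictions to divide the remaining unlabelled vectors into two child clusters. A minimal sketch, assuming scikit-learn's `SVC` (the script's actual kernel and parameters are not shown in the log, and `svm_split` is a hypothetical helper):

```python
from sklearn import svm

def svm_split(train_vectors, train_labels, remaining_vectors):
    # Train an SVM on the oracle-classified sample (labels 1 = match,
    # 0 = non-match), then split the unlabelled rest by its predictions.
    clf = svm.SVC()  # default RBF kernel; the script's settings may differ
    clf.fit(train_vectors, train_labels)
    preds = clf.predict(remaining_vectors)
    matches = [v for v, p in zip(remaining_vectors, preds) if p == 1]
    non_matches = [v for v, p in zip(remaining_vectors, preds) if p == 0]
    return matches, non_matches
```

Both resulting clusters go back onto the queue, which is why the queue length grows to 2 in the next loop.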

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (159, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)
    (779, 0.6818181818181818, 0.9023932827949789, 0.3181818181818182)

Current size of match and non-match training data sets: 28 / 60

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.90
- Size 159 weight vectors
- Estimated match proportion 0.318

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 159 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.833, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 48 matches and 7 non-matches
    Purity of oracle classification:  0.873
    Entropy of oracle classification: 0.550
    Number of true matches:      48
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)928_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (15, 1 - acm diverg, 928), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)928_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 861
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 861 weight vectors
  Containing 227 true matches and 634 true non-matches
    (26.36% true matches)
  Identified 804 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   767  (95.40%)
          2 :    34  (4.23%)
          3 :     2  (0.25%)
         20 :     1  (0.12%)
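The occurrence table above can be reproduced with two nested `Counter`s: one over the vectors themselves and one over how often each distinct vector occurs (`occurrence_distribution` is a hypothetical helper name):

```python
from collections import Counter

def occurrence_distribution(vectors):
    # Count how often each distinct weight vector occurs (tuples are
    # hashable, lists are not), then tabulate how many distinct vectors
    # share each occurrence count.
    vec_counts = Counter(tuple(v) for v in vectors)
    return Counter(vec_counts.values())
```

The entry "20 : 1" above means exactly one distinct weight vector appears 20 times, which is typical of all-zero or all-one similarity vectors.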

Identified 1 non-pure unique weight vector (from 804 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 613

Removed 1 non-pure weight vector

Final number of weight vectors to use: 860
  Number of unique weight vectors: 804

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (804, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 804 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using "far" method

Farthest first selection of 86 weight vectors from 804 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 28 matches and 58 non-matches
    Purity of oracle classification:  0.674
    Entropy of oracle classification: 0.910
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  58
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 718 weight vectors
  Based on 28 matches and 58 non-matches
  Classified 153 matches and 565 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)
    (565, 0.6744186046511628, 0.9103480624345153, 0.32558139534883723)

Current size of match and non-match training data sets: 28 / 58

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 153 weight vectors
- Estimated match proportion 0.326

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.944, 1.000, 1.000, 1.000, 0.960, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 50 matches and 5 non-matches
    Purity of oracle classification:  0.909
    Entropy of oracle classification: 0.439
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(20)253_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 253), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)253_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.05 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analysing the file: diverg(15)76_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.133779
f-measure              0.235988
da                           40
dm                            0
ndm                           0
tp                           40
fp                            0
tn                  4.76529e+07
fn                          259
Name: (15, 1 - acm diverg, 76), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)76_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 663
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 663 weight vectors
  Containing 217 true matches and 446 true non-matches
    (32.73% true matches)
  Identified 626 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   608  (97.12%)
          2 :    15  (2.40%)
          3 :     2  (0.32%)
         19 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 626 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.947 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 443

Removed 1 non-pure weight vector

Final number of weight vectors to use: 662
  Number of unique weight vectors: 626

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (626, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 626 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 626 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.818, 0.500, 0.500, 0.250, 0.500] (False)
    [0.667, 0.000, 0.533, 0.737, 0.353, 0.667, 0.478] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [0.800, 0.000, 0.526, 0.750, 0.250, 0.204, 0.313] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
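
The "farthest first" selection above is a greedy farthest-first traversal: repeatedly pick the vector whose distance to the closest already-selected vector is largest. A minimal sketch, assuming Euclidean distance and an arbitrary first-element seed (the program's seed choice and metric may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedily select k vectors, each time taking the one farthest
    from the set selected so far (Euclidean distance assumed)."""
    selected = [vectors[0]]
    # Distance from each vector to its nearest selected vector
    min_d = [math.dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], math.dist(v, vectors[i]))
    return selected

sel = farthest_first([[0, 0], [1, 0], [0, 1], [10, 10]], 2)
```

Because each new pick maximizes the minimum distance to the current selection, the sample spreads across the corners of the weight-vector space, which is why the list above mixes clear matches and clear non-matches.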

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
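
The purity and entropy figures printed for each sample are the majority-class fraction and the binary (Shannon) entropy of the sampled match proportion; that proportion also carries over to the child clusters listed in the next loop's queue. A sketch reproducing the numbers above:

```python
import math

def sample_stats(num_matches, num_non_matches):
    """Purity (majority-class fraction), binary entropy, and match
    proportion of an oracle-labelled sample."""
    n = num_matches + num_non_matches
    p = num_matches / n
    entropy = sum(-q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return max(p, 1.0 - p), entropy, p

purity, entropy, prop = sample_stats(28, 55)
# purity ≈ 0.663, entropy ≈ 0.922, prop ≈ 0.337 (as printed above)
```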

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 543 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 144 matches and 399 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (399, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.92
- Size 399 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 399 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.846, 0.542, 0.588, 0.579, 0.423] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.538, 0.778, 0.636, 0.632, 0.563] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [0.733, 1.000, 0.100, 0.135, 0.095, 0.176, 0.282] (False)
    [1.000, 0.000, 0.316, 0.583, 0.435, 0.833, 0.692] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.636, 0.727, 0.389, 0.625, 0.333] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.615, 0.826, 0.286, 0.857, 0.643] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.673, 0.000, 0.500, 0.737, 0.500, 0.818, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.222, 0.643, 0.800, 0.750, 0.692] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 9 matches and 62 non-matches
    Purity of oracle classification:  0.873
    Entropy of oracle classification: 0.548
    Number of true matches:      9
    Number of false matches:     0
    Number of true non-matches:  62
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

40.0
Analyzing file: diverg(20)644_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 644), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)644_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analyzing file: diverg(10)877_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, kid!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 877), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)877_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 640
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 640 weight vectors
  Containing 199 true matches and 441 true non-matches
    (31.09% true matches)
  Identified 607 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   591  (97.36%)
          2 :    13  (2.14%)
          3 :     2  (0.33%)
         17 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 607 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 168
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 438

Removed 1 non-pure weight vector

Final number of weight vectors to use: 639
  Number of unique weight vectors: 607

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (607, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 607 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 607 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.632, 0.789, 0.667, 0.407, 0.417] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.667, 0.571, 0.563, 0.333, 0.867] (False)
    [1.000, 0.000, 0.385, 0.826, 0.429, 0.769, 0.588] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.857, 0.591, 0.636, 0.783, 0.818] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.391, 0.538, 0.455, 0.548, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 27 matches and 56 non-matches
    Purity of oracle classification:  0.675
    Entropy of oracle classification: 0.910
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0
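
The purity, entropy, and estimated match proportion reported throughout this log can be reproduced from the oracle's match/non-match counts. A minimal sketch (the function name is mine, not from the program):

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity, binary entropy, and estimated match proportion
    of a cluster, given oracle-labelled counts."""
    total = num_match + num_non_match
    match_prop = num_match / total              # estimated match proportion
    purity = max(match_prop, 1.0 - match_prop)  # fraction in the majority class
    entropy = 0.0
    for q in (match_prop, 1.0 - match_prop):
        if q > 0.0:
            entropy -= q * math.log(q, 2)       # binary (Shannon) entropy
    return purity, entropy, match_prop

# The oracle above classified 27 matches and 56 non-matches;
# this reproduces the 0.675 purity and 0.910 entropy figures:
purity, entropy, match_prop = cluster_stats(27, 56)
```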

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 524 weight vectors
  Based on 27 matches and 56 non-matches
  Classified 136 matches and 388 non-matches
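
The SVM step trains on the oracle-labelled sample and splits the remaining weight vectors of the cluster into a predicted-match and a predicted-non-match cluster. A hedged sketch using scikit-learn (the program's actual SVM settings are not visible in this log, so a linear kernel with default parameters is assumed):

```python
import numpy as np
from sklearn import svm

def svm_split(train_match, train_non_match, unlabelled):
    """Train an SVM on oracle-labelled weight vectors, then split the
    remaining (unlabelled) weight vectors into two clusters."""
    X = np.vstack([train_match, train_non_match])
    y = np.array([1] * len(train_match) + [0] * len(train_non_match))
    clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(X, y)
    pred = clf.predict(np.asarray(unlabelled))
    match_cluster = [v for v, p in zip(unlabelled, pred) if p == 1]
    non_match_cluster = [v for v, p in zip(unlabelled, pred) if p == 0]
    return match_cluster, non_match_cluster

# Toy usage with 2-dimensional similarity vectors:
m, nm = svm_split([[0.9, 0.8], [1.0, 0.9]],
                  [[0.1, 0.2], [0.2, 0.1]],
                  [[0.95, 0.85], [0.15, 0.15]])
```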

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (136, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)
    (388, 0.6746987951807228, 0.9100534290139191, 0.3253012048192771)

Current size of match and non-match training data sets: 27 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 388 weight vectors
- Estimated match proportion 0.325

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 388 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.636, 0.786, 0.750, 0.139, 0.313] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.400, 0.737, 0.529, 0.750, 0.367] (False)
    [0.710, 0.000, 0.600, 0.654, 0.273, 0.290, 0.217] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.600, 0.700, 0.600, 0.611, 0.706] (False)
    [1.000, 0.000, 0.296, 0.600, 0.471, 0.600, 0.643] (False)
    [1.000, 0.000, 0.909, 0.500, 0.500, 0.361, 0.625] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.741, 0.474, 0.667, 0.500, 0.300] (False)
    [1.000, 0.000, 0.370, 0.450, 0.750, 0.550, 0.529] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.714, 0.600, 0.647, 0.529] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.615, 0.826, 0.286, 0.857, 0.643] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.800, 1.000, 0.242, 0.121, 0.200, 0.171, 0.000] (False)
    [1.000, 0.000, 0.818, 0.833, 0.412, 0.625, 0.833] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.233, 0.667, 0.688, 0.455, 0.263] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)
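
The "farthest first" listings above come from the classic farthest-first traversal: starting from a seed, repeatedly add the vector whose distance to its nearest already-selected vector is largest. A minimal sketch (the seed choice and the Euclidean metric are assumptions; the program's actual distance function is not shown in this log):

```python
import math

def farthest_first(vectors, k):
    """Select k vectors by farthest-first traversal: start from the
    first vector, then greedily add the vector whose distance to its
    nearest selected vector is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seeding with the first vector is an assumption
    # Minimum distance from each vector to the selected set so far
    min_d = [dist(v, selected[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], dist(v, vectors[i]))
    return selected

# Toy usage: the two far corners are picked before the nearby point
pts = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.0, 1.0]]
sel = farthest_first(pts, 3)
```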

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 6 matches and 63 non-matches
    Purity of oracle classification:  0.913
    Entropy of oracle classification: 0.426
    Number of true matches:      6
    Number of false matches:     0
    Number of true non-matches:  63
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(15)377_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 377), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)377_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 727
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 727 weight vectors
  Containing 205 true matches and 522 true non-matches
    (28.20% true matches)
  Identified 701 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   687  (98.00%)
          2 :    11  (1.57%)
          3 :     2  (0.29%)
         12 :     1  (0.14%)
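
The uniqueness and frequency figures above can be derived by counting how often each weight vector repeats. A small sketch:

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each weight vector occurs, then how many
    unique vectors occur with each frequency."""
    vec_counts = Counter(tuple(v) for v in weight_vectors)  # vector -> count
    freq_dist = Counter(vec_counts.values())                # count -> num vectors
    return len(vec_counts), dict(freq_dist)

# Toy usage: 5 vectors, 3 unique, one repeated three times
n_unique, freq = occurrence_distribution(
    [[0.1, 0.2], [0.1, 0.2], [0.1, 0.2], [0.5, 0.5], [0.9, 1.0]])
```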

Identified 1 non-pure unique weight vector (from 701 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 179
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 521

Removed 1 non-pure weight vector
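
Removing a non-pure weight vector here means dropping the minority-class occurrences of a unique weight vector that appears with both match and non-match labels (such as the 0.917-pure vector above). A sketch inferred from the output, not taken from the original code; counting ties as matches is an assumption:

```python
from collections import defaultdict

def remove_minority_class(labelled_vectors):
    """Drop the minority-class occurrences of every unique weight
    vector that occurs with both labels. Majority is decided by '>=',
    so ties count as matches (an assumption about the original)."""
    counts = defaultdict(lambda: [0, 0])   # vector -> [non-matches, matches]
    for vec, is_match in labelled_vectors:
        counts[vec][int(is_match)] += 1
    kept = []
    for vec, is_match in labelled_vectors:
        n_non, n_match = counts[vec]
        if is_match == (n_match >= n_non):  # keep only majority-label copies
            kept.append((vec, is_match))
    return kept

# One vector occurs twice as a match and once as a non-match;
# only the single non-match occurrence is removed:
data = [((0.9, 0.8), True), ((0.9, 0.8), True), ((0.9, 0.8), False),
        ((0.1, 0.1), False)]
kept = remove_minority_class(data)
```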

Final number of weight vectors to use: 726
  Number of unique weight vectors: 701

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (701, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 701 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 701 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 27 matches and 57 non-matches
    Purity of oracle classification:  0.679
    Entropy of oracle classification: 0.906
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 617 weight vectors
  Based on 27 matches and 57 non-matches
  Classified 127 matches and 490 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (127, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)
    (490, 0.6785714285714286, 0.9059282160429992, 0.32142857142857145)

Current size of match and non-match training data sets: 27 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.68 and entropy 0.91
- Size 490 weight vectors
- Estimated match proportion 0.321

Sample size for this cluster: 71

Farthest first selection of 71 weight vectors from 490 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.565, 0.667, 0.600, 0.412, 0.381] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.747, 1.000, 0.222, 0.314, 0.212, 0.108, 0.277] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.711, 0.000, 0.800, 0.762, 0.857, 0.778, 0.348] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.833, 0.727, 0.818, 0.750, 0.722] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.900, 0.643, 0.318, 0.452, 0.286] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.667, 0.000, 0.650, 0.895, 0.706, 0.455, 0.600] (False)
    [0.667, 0.000, 0.450, 0.733, 0.682, 0.516, 0.263] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.895, 0.625, 0.750, 0.278, 0.188] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.583, 0.875, 0.611, 0.833, 0.778] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.444, 0.643, 0.889, 0.750, 0.643] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.944, 0.231, 0.111, 0.143, 0.214, 0.333] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.455, 0.714, 0.429, 0.550, 0.647] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.367, 0.800, 0.833, 0.306, 0.789] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.267, 0.733, 0.471, 0.833, 0.526] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.783, 0.933, 0.417, 0.315, 0.438] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 12 matches and 59 non-matches
    Purity of oracle classification:  0.831
    Entropy of oracle classification: 0.655
    Number of true matches:      12
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)996_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985714
recall                 0.230769
f-measure              0.373984
da                           70
dm                            0
ndm                           0
tp                           69
fp                            1
tn                  4.76529e+07
fn                          230
Name: (10, 1 - acm diverg, 996), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)996_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 717
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 717 weight vectors
  Containing 193 true matches and 524 true non-matches
    (26.92% true matches)
  Identified 675 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   640  (94.81%)
          2 :    32  (4.74%)
          3 :     2  (0.30%)
          7 :     1  (0.15%)

Identified 0 non-pure unique weight vectors (from 675 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 171
     0.000 : 504

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 717
  Number of unique weight vectors: 675

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (675, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 675 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 675 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.481, 0.217, 0.125, 0.148, 0.148] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.267, 1.000, 0.762, 0.727, 0.619] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.450, 0.417, 0.647, 0.000, 0.000] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.476, 0.455, 0.833, 0.636, 0.278] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.423, 0.609, 0.857, 0.361, 0.688] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.387, 1.000, 0.146, 0.200, 0.200, 0.111, 0.115] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 84 weight vectors
  The oracle will correctly classify 84 weight vectors and wrongly classify 0
  Classified 31 matches and 53 non-matches
    Purity of oracle classification:  0.631
    Entropy of oracle classification: 0.950
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 84 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 591 weight vectors
  Based on 31 matches and 53 non-matches
  Classified 285 matches and 306 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 84
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (285, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)
    (306, 0.6309523809523809, 0.9499380214234903, 0.36904761904761907)

Current size of match and non-match training data sets: 31 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.63 and entropy 0.95
- Size 306 weight vectors
- Estimated match proportion 0.369

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 306 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.667, 0.333, 0.917, 0.000, 0.000] (False)
    [1.000, 0.000, 0.667, 0.389, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.367, 0.545, 0.238, 0.727, 0.429] (False)
    [1.000, 0.000, 0.067, 0.650, 0.579, 0.500, 0.286] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.524, 0.357, 0.833, 0.194, 0.313] (False)
    [1.000, 0.000, 0.767, 0.545, 0.818, 0.714, 0.773] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.240, 0.714, 0.455, 0.778, 0.591] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.423, 0.478, 0.357, 0.615, 0.727] (False)
    [1.000, 0.000, 0.750, 0.533, 0.294, 0.333, 0.429] (False)
    [0.917, 0.000, 0.524, 0.455, 0.417, 0.875, 0.556] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [0.800, 0.000, 0.625, 0.571, 0.467, 0.474, 0.667] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.214, 0.333, 0.588, 0.476] (False)
    [1.000, 0.000, 0.583, 0.786, 0.842, 0.800, 0.833] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 0.000, 0.778, 0.500, 0.789, 0.750, 0.385] (False)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [1.000, 0.000, 0.333, 0.600, 0.800, 0.778, 0.813] (False)
    [1.000, 0.000, 0.458, 0.909, 0.350, 0.438, 0.375] (False)
    [1.000, 0.000, 0.900, 0.429, 0.412, 0.588, 0.357] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.741, 0.556, 0.667, 0.350, 0.556] (False)
    [1.000, 0.000, 0.833, 0.833, 0.550, 0.500, 0.688] (False)
    [1.000, 0.000, 0.600, 0.857, 0.579, 0.286, 0.545] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.875, 0.467, 0.471, 0.833, 0.571] (False)
    [1.000, 0.000, 0.261, 0.857, 0.800, 0.778, 0.619] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.667, 0.636, 0.500, 0.250, 0.400] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.778, 0.875, 0.833, 0.600, 0.722] (False)
    [0.857, 0.000, 0.500, 0.389, 0.235, 0.045, 0.526] (False)
    [1.000, 0.000, 0.429, 0.571, 0.333, 0.444, 0.400] (False)
    [1.000, 0.000, 0.556, 0.364, 0.583, 0.500, 0.636] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [1.000, 0.000, 0.500, 0.375, 0.417, 0.259, 0.250] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [1.000, 0.000, 0.367, 0.429, 0.571, 0.306, 0.762] (False)
    [1.000, 0.000, 0.000, 0.700, 0.818, 0.444, 0.619] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [1.000, 0.000, 0.407, 0.643, 0.667, 0.500, 0.563] (False)
    [1.000, 0.000, 0.767, 0.667, 0.545, 0.786, 0.773] (False)
    [1.000, 0.000, 0.263, 0.333, 0.708, 0.600, 0.650] (False)
    [1.000, 0.000, 0.550, 0.833, 0.636, 0.875, 0.545] (False)
    [1.000, 0.000, 0.700, 0.214, 0.368, 0.529, 0.714] (False)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.471, 0.643] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [1.000, 0.000, 0.500, 0.364, 0.833, 0.417, 0.786] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
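The farthest first selection reported above greedily picks, at each step, the weight vector whose minimum distance to the already selected vectors is largest. A minimal sketch, where the starting vector and the use of Euclidean distance are assumptions:

```python
import numpy as np

def farthest_first(vectors, k, start=0):
    """Greedy farthest-first traversal: repeatedly add the vector whose
    minimum Euclidean distance to the selected set is largest."""
    vecs = np.asarray(vectors, dtype=float)
    selected = [start]
    # Distance from every vector to its nearest selected vector so far
    min_dist = np.linalg.norm(vecs - vecs[start], axis=1)
    while len(selected) < k:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(vecs - vecs[nxt], axis=1))
    return selected
```

Keeping the running minimum distance makes each step O(n·d) rather than recomputing all pairwise distances, which matters for the larger clusters seen later in this log.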

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 0 matches and 69 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  69
    Number of false non-matches: 0
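The purity and entropy figures reported throughout this log are consistent with the usual two-class definitions: purity is the proportion of the majority class, and entropy is the binary Shannon entropy of the match proportion (0.0 for a pure cluster such as the 0/69 split above, 1.0 for a 50/50 split). A sketch:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity = proportion of the majority class; entropy = binary
    Shannon entropy of the match proportion."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # lim q->0 of q*log2(q) is 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For example, 31 matches and 42 non-matches give purity 42/73 ≈ 0.575 and entropy ≈ 0.984, matching the earlier Loop 2 queue listing.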

*** Warning: Oracle returns an empty match dictionary ***
Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

70.0
Analysing the file: diverg(10)531_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.990909
recall                 0.364548
f-measure              0.533007
da                          110
dm                            0
ndm                           0
tp                          109
fp                            1
tn                  4.76529e+07
fn                          190
Name: (10, 1 - acm diverg, 531), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)531_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 313
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 313 weight vectors
  Containing 143 true matches and 170 true non-matches
    (45.69% true matches)
  Identified 298 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   289  (96.98%)
          2 :     6  (2.01%)
          3 :     2  (0.67%)
          6 :     1  (0.34%)

Identified 1 non-pure unique weight vector (from 298 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 130
     0.833 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 167

Removed 6 non-pure weight vectors

Final number of weight vectors to use: 307
  Number of unique weight vectors: 297
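A non-pure unique weight vector is one that occurs with both true-match and true-non-match labels; identical vectors with conflicting labels cannot be separated by any classifier, so the program drops them before training. A rough sketch of this step, implementing the remove-all-occurrences policy (the log also mentions a minority-class-only variant):

```python
from collections import defaultdict

def remove_non_pure(weight_vectors, labels):
    """Group identical weight vectors, compute each group's pureness
    (fraction of true matches), and drop every vector whose group is
    neither all matches (1.0) nor all non-matches (0.0)."""
    groups = defaultdict(list)
    for vec, lab in zip(weight_vectors, labels):
        groups[tuple(vec)].append(lab)
    pureness = {v: sum(labs) / len(labs) for v, labs in groups.items()}
    return [(vec, lab) for vec, lab in zip(weight_vectors, labels)
            if pureness[tuple(vec)] in (0.0, 1.0)]
```

This explains the counts above: one non-pure unique vector occurring six times accounts for the six removed weight vectors.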

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (297, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 297 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 73

Perform initial selection using "far" method

Farthest first selection of 73 weight vectors from 297 vectors
  The selected farthest weight vectors are:
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.733, 0.000, 0.176, 0.348, 0.351, 0.217, 0.188] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 0.000, 0.857, 0.571, 0.556, 0.556, 0.722] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 0.000, 0.636, 0.818, 0.438, 0.313, 0.833] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 31 matches and 42 non-matches
    Purity of oracle classification:  0.575
    Entropy of oracle classification: 0.984
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  42
    Number of false non-matches: 0
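The oracle above runs with 100.00% accuracy, so every label it returns is correct. For accuracies below 1.0, a simulated oracle would flip the true match status of roughly a (1 − accuracy) fraction of the sampled vectors; a sketch under that assumption (function name and seeding are illustrative, not from the original program):

```python
import random

def simulate_oracle(true_labels, accuracy, seed=None):
    """Return oracle labels: each true label is kept with probability
    `accuracy` and flipped otherwise, modelling an imperfect oracle."""
    rng = random.Random(seed)
    return [lab if rng.random() < accuracy else not lab
            for lab in true_labels]
```

With accuracy 1.0 this reduces to the perfect oracle seen in this log, which is why the "false matches" and "false non-matches" counts are always zero here.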

Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 224 weight vectors
  Based on 31 matches and 42 non-matches
  Classified 99 matches and 125 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 73
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (99, 0.5753424657534246, 0.9835585673909616, 0.4246575342465753)
    (125, 0.5753424657534246, 0.9835585673909616, 0.4246575342465753)

Current size of match and non-match training data sets: 31 / 42

Selected cluster (queue ordering: random) with:
- Purity 0.58 and entropy 0.98
- Size 99 weight vectors
- Estimated match proportion 0.425

Sample size for this cluster: 48

Farthest first selection of 48 weight vectors from 99 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.929, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)

Perform oracle with 100.00% accuracy on 48 weight vectors
  The oracle will correctly classify 48 weight vectors and wrongly classify 0
  Classified 42 matches and 6 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      42
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 48 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

110.0
Analysing the file: diverg(10)470_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 470), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)470_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 582
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 582 weight vectors
  Containing 207 true matches and 375 true non-matches
    (35.57% true matches)
  Identified 549 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   532  (96.90%)
          2 :    14  (2.55%)
          3 :     2  (0.36%)
         16 :     1  (0.18%)

Identified 1 non-pure unique weight vector (from 549 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 176
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 372

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 581
  Number of unique weight vectors: 549

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (549, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 549 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 82

Perform initial selection using "far" method

Farthest first selection of 82 weight vectors from 549 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.722, 0.895, 0.182, 0.316] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.435, 0.500, 0.500, 0.647, 0.476] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.636, 0.452, 0.632, 0.139, 0.762] (False)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.818, 0.538, 0.545, 0.722, 0.313] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.368, 0.710, 0.826, 0.333, 0.429] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 82 weight vectors
  The oracle will correctly classify 82 weight vectors and wrongly classify 0
  Classified 29 matches and 53 non-matches
    Purity of oracle classification:  0.646
    Entropy of oracle classification: 0.937
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  53
    Number of false non-matches: 0

Deleted 82 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 467 weight vectors
  Based on 29 matches and 53 non-matches
  Classified 150 matches and 317 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 82
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (150, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)
    (317, 0.6463414634146342, 0.9372930661967527, 0.35365853658536583)

Current size of match and non-match training data sets: 29 / 53

Selected cluster (queue ordering: random) with:
- Purity 0.65 and entropy 0.94
- Size 150 weight vectors
- Estimated match proportion 0.354

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 150 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.520, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.882] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 51 matches and 5 non-matches
    Purity of oracle classification:  0.911
    Entropy of oracle classification: 0.434
    Number of true matches:      51
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(15)911_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                  0.99
recall                 0.331104
f-measure              0.496241
da                          100
dm                            0
ndm                           0
tp                           99
fp                            1
tn                  4.76529e+07
fn                          200
Name: (15, 1 - acm diverg, 911), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)911_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1040
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1040 weight vectors
  Containing 167 true matches and 873 true non-matches
    (16.06% true matches)
  Identified 1001 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   972  (97.10%)
          2 :    26  (2.60%)
          3 :     2  (0.20%)
         10 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1001 unique weight vectors)
Pureness (as proportion of matches) per unique weight vector:
  Pureness : Count
     1.000 : 148
     0.900 :  1   (all weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1039
  Number of unique weight vectors: 1001

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1001, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1001 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1001 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
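
The "far" initial selection above is a farthest-first traversal: starting from one vector, it repeatedly adds the vector whose distance to the nearest already-selected vector is largest. A minimal sketch, assuming Euclidean distance and a fixed starting vector (the program's exact distance metric and starting rule are not visible in this log):

```python
# Farthest-first traversal sketch; distance metric and start vector
# are assumptions, not taken from the program.
import math
import random

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def farthest_first(vectors, k):
    """Greedily pick k vectors, each maximising the distance
    to its nearest already-selected vector."""
    selected = [vectors[0]]                      # arbitrary start
    dist = [euclidean(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=dist.__getitem__)
        selected.append(vectors[i])
        dist = [min(d, euclidean(v, vectors[i]))
                for d, v in zip(dist, vectors)]
    return selected

random.seed(7)
data = [[random.random() for _ in range(7)] for _ in range(100)]
sample = farthest_first(data, 10)    # cf. selecting 87 from 1001 above
```

Each new pick maximises the minimum distance to the current selection, which is why the listed vectors spread over the corners of the weight space.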

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 23 matches and 64 non-matches
    Purity of oracle classification:  0.736
    Entropy of oracle classification: 0.833
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  64
    Number of false non-matches: 0
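
The purity and entropy figures reported for each labelled sample follow the usual definitions: purity is the majority-class fraction, and entropy is the binary Shannon entropy of the match proportion. A small check against the counts above (23 matches, 64 non-matches); the function name is illustrative:

```python
# Purity = majority-class fraction; entropy = binary Shannon entropy
# of the match proportion (base-2 logarithm).
import math

def purity_entropy(num_matches, num_non_matches):
    total = num_matches + num_non_matches
    p = num_matches / total                 # estimated match proportion
    purity = max(p, 1.0 - p)
    entropy = sum(-q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = purity_entropy(23, 64)    # the sample labelled above
print(round(purity, 3), round(entropy, 3))  # 0.736 0.833
```

These are the same values (0.7356..., 0.8332...) that later reappear as the child clusters' purity and entropy in the Loop 2 queue.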

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 914 weight vectors
  Based on 23 matches and 64 non-matches
  Classified 57 matches and 857 non-matches
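
The split step trains an SVM on the oracle-labelled sample and partitions the remaining cluster by predicted class; both child clusters then go back onto the queue. A sketch using scikit-learn's `svm.SVC` with default settings; the kernel choice and the toy training vectors here are assumptions, not taken from the program:

```python
# Assumed: scikit-learn SVC with its default RBF kernel; the 7-dim
# training vectors below are hypothetical oracle-labelled examples.
from sklearn import svm

# oracle-labelled sample: 1 = match, 0 = non-match (hypothetical values)
train_x = [[0.9] * 7, [1.0] * 7, [0.85] * 7, [0.8] * 7,
           [0.1] * 7, [0.0] * 7, [0.2] * 7, [0.3] * 7]
train_y = [1, 1, 1, 1, 0, 0, 0, 0]

clf = svm.SVC()
clf.fit(train_x, train_y)

# split the rest of the cluster by predicted class; both children are
# put back on the queue for possible further splitting
rest = [[0.95] * 7, [0.05] * 7, [0.5] * 7, [0.75] * 7]
pred = clf.predict(rest)
match_cluster = [v for v, c in zip(rest, pred) if c == 1]
non_match_cluster = [v for v, c in zip(rest, pred) if c == 0]
```

In the run above this split produced child clusters of 57 predicted matches and 857 predicted non-matches.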

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (57, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)
    (857, 0.735632183908046, 0.8332661971210124, 0.26436781609195403)

Current size of match and non-match training data sets: 23 / 64

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 857 weight vectors
- Estimated match proportion 0.264

Sample size for this cluster: 69

Farthest first selection of 69 weight vectors from 857 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.767, 1.000, 0.300, 0.250, 0.091, 0.056, 0.076] (False)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 69 weight vectors
  The oracle will correctly classify 69 weight vectors and wrongly classify 0
  Classified 15 matches and 54 non-matches
    Purity of oracle classification:  0.783
    Entropy of oracle classification: 0.755
    Number of true matches:      15
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 69 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

100.0
Analysing the file: diverg(20)855_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 855), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)855_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)
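
The occurrence table above is a histogram of duplicate counts: map each unique weight vector to how often it occurs, then count how many vectors share each occurrence count. A toy sketch with `collections.Counter` on 2-dimensional vectors (the data is invented for illustration):

```python
# Occurrence histogram sketch over invented 2-dim weight vectors.
from collections import Counter

vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.9), (0.2, 0.9), (0.2, 0.9),
           (0.7, 0.1)]                       # tuples so they are hashable
occ = Counter(vectors)                       # vector -> occurrence count
freq = Counter(occ.values())                 # occurrence count -> #vectors
for times, n in sorted(freq.items()):
    print("%10d : %5d  (%.2f%%)" % (times, n, 100.0 * n / len(occ)))
```

The percentage column, as in the log, is taken over the number of unique vectors.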

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
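
The "minority class" removal can be read as: group identical weight vectors, compute each group's match fraction (pureness), and drop the minority-status copies of any mixed group. An illustrative reconstruction; `remove_minority` is a hypothetical helper, and the tie-break for equal counts is an assumption:

```python
# Hypothetical reconstruction of the non-pure vector clean-up.
from collections import defaultdict

def remove_minority(weight_vectors):
    """weight_vectors: list of (weights_tuple, is_match) pairs."""
    counts = defaultdict(lambda: [0, 0])     # vec -> [#non-match, #match]
    for vec, is_match in weight_vectors:
        counts[vec][int(is_match)] += 1
    kept = []
    for vec, is_match in weight_vectors:
        non_m, m = counts[vec]
        if non_m == 0 or m == 0:             # pure vector: keep all copies
            kept.append((vec, is_match))
        elif is_match == (m >= non_m):       # keep only the majority class
            kept.append((vec, is_match))
    return kept

# one vector occurring 20 times with pureness 19/20 = 0.95, plus a pure one
data = ([((0.9, 1.0), True)] * 19 + [((0.9, 1.0), False)]
        + [((0.1, 0.2), False)] * 5)
print(len(remove_minority(data)))            # 24: one minority copy removed
```

This matches the run above, where the single vector with pureness 0.950 loses exactly one copy (1101 vectors become 1100).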

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 0 matches and 956 non-matches

39.0
Analysing the file: diverg(15)550_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (15, 1 - acm diverg, 550), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)550_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 694
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 694 weight vectors
  Containing 200 true matches and 494 true non-matches
    (28.82% true matches)
  Identified 649 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   615  (94.76%)
          2 :    31  (4.78%)
          3 :     2  (0.31%)
         11 :     1  (0.15%)

Identified 1 non-pure unique weight vector (from 649 unique weight vectors)
Pureness (as proportion of matches) for each unique weight vector:
  Pureness : Count
     1.000 : 175
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 473

Removed 1 non-pure weight vector

Final number of weight vectors to use: 693
  Number of unique weight vectors: 649

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (649, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 649 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 649 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 566 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 156 matches and 410 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (156, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (410, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster with (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 156 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 156 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.857, 0.727, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 46 matches and 10 non-matches
    Purity of oracle classification:  0.821
    Entropy of oracle classification: 0.677
    Number of true matches:      46
    Number of false matches:     0
    Number of true non-matches:  10
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(20)482_NEW.csv
<class 'pandas.core.series.Series'>
Linha atual aqui, jovem!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 482), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)482_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1101
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1101 weight vectors
  Containing 227 true matches and 874 true non-matches
    (20.62% true matches)
  Identified 1044 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1007  (96.46%)
          2 :    34  (3.26%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1044 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 190
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 853

Removed 1 non-pure weight vector
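The cleaning step above groups identical weight vectors and, for any group that mixes true matches and non-matches, removes the minority-class copies (here the single non-match copy of a vector that occurs 20 times with pureness 0.950). A sketch of this step (the function `remove_minority_copies` is a hypothetical helper, not from the program):

```python
from collections import Counter, defaultdict

def remove_minority_copies(weight_vectors, labels):
    # Group identical weight vectors together with their true match labels.
    groups = defaultdict(list)
    for vec, label in zip(weight_vectors, labels):
        groups[tuple(vec)].append(label)
    # For each non-pure group, keep only the majority-class copies
    # (ties are resolved arbitrarily by Counter.most_common).
    kept = []
    for vec, labs in groups.items():
        majority = Counter(labs).most_common(1)[0][0]
        kept.extend((list(vec), lab) for lab in labs if lab == majority)
    return kept

# Hypothetical data: one vector occurs 20 times as 19 matches + 1 non-match
vecs = [[0.9, 0.8]] * 20 + [[0.1, 0.2]]
labels = [True] * 19 + [False] + [False]
print(len(remove_minority_copies(vecs, labels)))  # 20: minority copy removed
```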

Final number of weight vectors to use: 1100
  Number of unique weight vectors: 1044

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1044, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1044 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1044 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

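The "far" (farthest-first) selection used above greedily picks, at each step, the weight vector whose minimum distance to the already-selected vectors is largest, so the sample spreads across the cluster. A sketch under assumed details (Euclidean distance, first vector as seed; the program's exact choices may differ):

```python
import math

def farthest_first(vectors, k, seed=0):
    # Greedy farthest-first traversal: start from a seed vector and
    # repeatedly add the vector whose distance to the nearest
    # already-selected vector is largest.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [seed]
    min_dist = [dist(v, vectors[seed]) for v in vectors]
    while len(selected) < k:
        nxt = max(range(len(vectors)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Update each vector's distance to its nearest selected vector.
        min_dist = [min(d, dist(v, vectors[nxt]))
                    for d, v in zip(min_dist, vectors)]
    return selected

# Hypothetical 2-D example: the three mutually distant corners win
corners = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.1, 0.1]]
print(farthest_first(corners, 3))  # [0, 1, 2]
```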
Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 956 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 109 matches and 847 non-matches
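The SVM step above trains on the oracle-labelled sample (23 matches, 65 non-matches) and splits the remaining weight vectors into a predicted-match and a predicted-non-match cluster. A sketch using scikit-learn (the kernel and parameters are assumptions; the program's own settings may differ):

```python
from sklearn import svm

def svm_split(train_vecs, train_labels, rest_vecs):
    # Train on the oracle-labelled sample, then partition the remaining
    # (unlabelled) weight vectors by predicted class.
    clf = svm.SVC(kernel="linear")  # kernel choice is an assumption
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(rest_vecs)
    matches = [v for v, p in zip(rest_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(rest_vecs, preds) if p == 0]
    return matches, non_matches

# Hypothetical 1-D example: low weights are non-matches, high are matches
m, n = svm_split([[0.0], [0.2], [0.8], [1.0]], [0, 0, 1, 1],
                 [[0.1], [0.9]])
print(m, n)  # [[0.9]] [[0.1]]
```

Each resulting cluster is then pushed back onto the queue with the purity, entropy and match-proportion estimates seen in the next loop.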

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (109, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster with (queue ordering: random):
- Purity 0.74 and entropy 0.83
- Size 847 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 68

Farthest first selection of 68 weight vectors from 847 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.308, 0.250, 0.381, 0.250, 0.200] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.294, 1.000, 0.128, 0.156, 0.152, 0.167, 0.180] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [1.000, 0.000, 0.714, 0.304, 0.533, 0.833, 0.529] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 14 matches and 54 non-matches
    Purity of oracle classification:  0.794
    Entropy of oracle classification: 0.734
    Number of true matches:      14
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(15)638_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 638), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)638_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 266
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 266 weight vectors
  Containing 209 true matches and 57 true non-matches
    (78.57% true matches)
  Identified 235 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   220  (93.62%)
          2 :    12  (5.11%)
          3 :     2  (0.85%)
         16 :     1  (0.43%)

Identified 1 non-pure unique weight vector (from 235 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 178
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 56

Removed 1 non-pure weight vector

Final number of weight vectors to use: 265
  Number of unique weight vectors: 235

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (235, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 235 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 68

Perform initial selection using "far" method

Farthest first selection of 68 weight vectors from 235 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.545, 0.714, 0.700, 0.833, 0.462] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.650, 1.000, 0.086, 0.219, 0.143, 0.108, 1.000] (False)
    [1.000, 0.000, 0.818, 0.786, 0.750, 0.306, 0.889] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.667, 0.867, 0.412, 0.647, 0.571] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.600, 0.714, 1.000, 0.611, 0.722] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 0.214, 0.244, 0.103, 0.441, 0.560] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.909, 0.818, 0.700, 0.625, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 68 weight vectors
  The oracle will correctly classify 68 weight vectors and wrongly classify 0
  Classified 38 matches and 30 non-matches
    Purity of oracle classification:  0.559
    Entropy of oracle classification: 0.990
    Number of true matches:      38
    Number of false matches:     0
    Number of true non-matches:  30
    Number of false non-matches: 0

Deleted 68 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 167 weight vectors
  Based on 38 matches and 30 non-matches
  Classified 160 matches and 7 non-matches

  Non-match cluster not large enough for required sample size
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 1
  Number of manual oracle classifications performed: 68
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (160, 0.5588235294117647, 0.9899927915575188, 0.5588235294117647)

Current size of match and non-match training data sets: 38 / 30

Selected cluster with (queue ordering: random):
- Purity 0.56 and entropy 0.99
- Size 160 weight vectors
- Estimated match proportion 0.559

Sample size for this cluster: 60

Farthest first selection of 60 weight vectors from 160 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 0.000, 0.667, 0.750, 0.417, 0.444, 0.750] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [0.436, 1.000, 1.000, 1.000, 1.000, 0.920, 1.000] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 0.750, 0.167, 0.182, 0.000, 0.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.000, 0.700, 0.800, 0.833, 0.647, 0.857] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [1.000, 0.000, 0.545, 0.857, 0.750, 0.500, 0.813] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.909, 0.786, 0.583, 0.444, 0.375] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)

Perform oracle with 100.00% accuracy on 60 weight vectors
  The oracle will correctly classify 60 weight vectors and wrongly classify 0
  Classified 44 matches and 16 non-matches
    Purity of oracle classification:  0.733
    Entropy of oracle classification: 0.837
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  16
    Number of false non-matches: 0

Deleted 60 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing the file: diverg(10)679_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.981818
recall                 0.180602
f-measure              0.305085
da                           55
dm                            0
ndm                           0
tp                           54
fp                            1
tn                  4.76529e+07
fn                          245
Name: (10, 1 - acm diverg, 679), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)679_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 665
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 665 weight vectors
  Containing 203 true matches and 462 true non-matches
    (30.53% true matches)
  Identified 614 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   580  (94.46%)
          2 :    31  (5.05%)
          3 :     2  (0.33%)
         17 :     1  (0.16%)

Identified 1 non-pure unique weight vector (from 614 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.941 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 441

Removed 1 non-pure weight vector

Final number of weight vectors to use: 664
  Number of unique weight vectors: 614

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (614, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 614 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 83

Perform initial selection using "far" method

Farthest first selection of 83 weight vectors from 614 vectors
  The selected farthest weight vectors are:
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.280, 0.818, 0.727, 0.357, 0.227] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.750, 0.125, 0.667, 0.185, 0.179] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.250, 0.714, 0.500, 0.389, 0.813] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [0.795, 1.000, 0.087, 0.154, 0.269, 0.147, 0.156] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.556, 0.318, 0.333, 0.200, 0.667] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.750, 0.133, 0.235, 0.571, 0.429] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.176, 0.304, 0.351, 0.217, 0.229] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.632, 0.111, 0.375] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 83 weight vectors
  The oracle will correctly classify 83 weight vectors and wrongly classify 0
  Classified 28 matches and 55 non-matches
    Purity of oracle classification:  0.663
    Entropy of oracle classification: 0.922
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  55
    Number of false non-matches: 0
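The purity and entropy figures reported above are consistent with the standard two-class definitions (purity = majority-class fraction, entropy = binary Shannon entropy of the match proportion). A minimal sketch under that assumption; the function name `cluster_stats` is ours, not from the original program:

```python
import math

def cluster_stats(num_match, num_non_match):
    """Purity and entropy of a two-class cluster of weight vectors.

    Purity is the fraction of the majority class; entropy is the binary
    Shannon entropy (in bits) of the match proportion.
    """
    total = num_match + num_non_match
    p = num_match / total
    purity = max(p, 1.0 - p)
    entropy = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:          # 0 * log(0) is taken as 0
            entropy -= q * math.log2(q)
    return purity, entropy
```

For the cluster above (28 matches, 55 non-matches) this gives purity 0.663 and entropy 0.922, matching the logged values.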

Deleted 83 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 531 weight vectors
  Based on 28 matches and 55 non-matches
  Classified 153 matches and 378 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 83
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (153, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)
    (378, 0.6626506024096386, 0.922259647473802, 0.3373493975903614)

Current size of match and non-match training data sets: 28 / 55

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.92
- Size 153 weight vectors
- Estimated match proportion 0.337

Sample size for this cluster: 55

Farthest first selection of 55 weight vectors from 153 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.800, 1.000, 1.000, 0.118, 0.227, 0.082, 0.061] (False)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.850, 0.125, 0.278, 0.118, 0.167] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
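The "farthest first" selections listed above can be produced by a greedy farthest-first traversal: repeatedly add the vector whose minimum distance to the already selected set is largest. A minimal sketch assuming Euclidean distance and an arbitrary starting vector (both are assumptions; the actual implementation may break ties or seed differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first selection of k vectors.

    vectors: list of tuples of floats (tuples so they are hashable).
    Returns the selected vectors in selection order.
    """
    selected = [vectors[0]]          # assumed starting point
    remaining = set(vectors) - {vectors[0]}
    while len(selected) < k and remaining:
        # pick the remaining vector farthest from the selected set,
        # i.e. with the largest minimum distance to any selected vector
        best = max(remaining,
                   key=lambda v: min(math.dist(v, s) for s in selected))
        selected.append(best)
        remaining.discard(best)
    return selected
```

On a one-dimensional toy set `[(0.0,), (1.0,), (5.0,), (10.0,)]` with `k=3`, the traversal picks `(0.0,)`, then `(10.0,)`, then `(5.0,)`, illustrating how the method spreads samples across the cluster.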

Perform oracle with 100.00% accuracy on 55 weight vectors
  The oracle will correctly classify 55 weight vectors and wrongly classify 0
  Classified 44 matches and 11 non-matches
    Purity of oracle classification:  0.800
    Entropy of oracle classification: 0.722
    Number of true matches:      44
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 55 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

55.0
Analysing file: diverg(10)462_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (10, 1 - acm diverg, 462), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)462_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 177
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 177 weight vectors
  Containing 167 true matches and 10 true non-matches
    (94.35% true matches)
  Identified 148 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   135  (91.22%)
          2 :    10  (6.76%)
          3 :     2  (1.35%)
         16 :     1  (0.68%)

Identified 1 non-pure unique weight vector (from 148 unique weight vectors)
Pureness (as a proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 138
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 :  9
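The occurrence and pureness analysis above can be reproduced with a simple counting pass over the loaded weight vectors. A sketch assuming the input is a list of `(weight_tuple, is_match)` pairs; the function and variable names are ours:

```python
from collections import Counter

def analyse_vectors(weight_vectors):
    """Occurrence frequency and per-vector pureness, as in the log above.

    weight_vectors: list of (weight_tuple, is_match) pairs.
    Returns (freq, pureness) where freq maps an occurrence count to the
    number of unique vectors occurring that often, and pureness maps each
    unique vector to its fraction of true matches.
    """
    occ = Counter(wv for wv, _ in weight_vectors)          # vector -> count
    matches = Counter(wv for wv, m in weight_vectors if m)  # vector -> matches
    freq = Counter(occ.values())
    pureness = {wv: matches[wv] / n for wv, n in occ.items()}
    return freq, pureness
```

A unique vector with pureness strictly between 0 and 1 (such as the 0.938 entry above) is "non-pure": the same weights were generated by both matching and non-matching record pairs, and its minority-class copies are removed.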

Removed 1 non-pure weight vector

Final number of weight vectors to use: 176
  Number of unique weight vectors: 148

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (148, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 148 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 58

Perform initial selection using the "far" method

Farthest first selection of 58 weight vectors from 148 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.500, 1.000, 1.000, 1.000, 0.941, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 0.944, 1.000, 0.900, 0.938, 0.867] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 0.958, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 52 matches and 6 non-matches
    Purity of oracle classification:  0.897
    Entropy of oracle classification: 0.480
    Number of true matches:      52
    Number of false matches:     0
    Number of true non-matches:  6
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 90 weight vectors
  Based on 52 matches and 6 non-matches
  Classified 90 matches and 0 non-matches

43.0
Analysing file: diverg(15)808_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.143813
f-measure              0.251462
da                           43
dm                            0
ndm                           0
tp                           43
fp                            0
tn                  4.76529e+07
fn                          256
Name: (15, 1 - acm diverg, 808), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)808_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 901
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 901 weight vectors
  Containing 211 true matches and 690 true non-matches
    (23.42% true matches)
  Identified 849 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   813  (95.76%)
          2 :    33  (3.89%)
          3 :     2  (0.24%)
         16 :     1  (0.12%)

Identified 1 non-pure unique weight vector (from 849 unique weight vectors)
Pureness (as a proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.938 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 669

Removed 1 non-pure weight vector

Final number of weight vectors to use: 900
  Number of unique weight vectors: 849

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (849, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 849 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 86

Perform initial selection using the "far" method

Farthest first selection of 86 weight vectors from 849 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 86 weight vectors
  The oracle will correctly classify 86 weight vectors and wrongly classify 0
  Classified 27 matches and 59 non-matches
    Purity of oracle classification:  0.686
    Entropy of oracle classification: 0.898
    Number of true matches:      27
    Number of false matches:     0
    Number of true non-matches:  59
    Number of false non-matches: 0

Deleted 86 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 763 weight vectors
  Based on 27 matches and 59 non-matches
  Classified 157 matches and 606 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 86
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (157, 0.686046511627907, 0.8976844934141643, 0.313953488372093)
    (606, 0.686046511627907, 0.8976844934141643, 0.313953488372093)

Current size of match and non-match training data sets: 27 / 59

Selected cluster (queue ordering: random):
- Purity 0.69 and entropy 0.90
- Size 606 weight vectors
- Estimated match proportion 0.314

Sample size for this cluster: 73

Farthest first selection of 73 weight vectors from 606 vectors
  The selected farthest weight vectors are:
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.929, 1.000, 0.212, 0.228, 0.250, 0.284, 0.262] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.909, 0.393, 0.500, 0.647, 0.429] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.615, 0.333, 0.688, 0.545, 0.538] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.857, 0.250, 0.667, 0.286, 0.600] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.727, 0.545, 0.263, 0.889, 0.692] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [0.857, 1.000, 0.000, 0.263, 0.190, 0.136, 0.000] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.400, 0.500, 0.579, 0.643, 0.846] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [1.000, 0.000, 0.846, 0.583, 0.579, 0.364, 0.231] (False)

Perform oracle with 100.00% accuracy on 73 weight vectors
  The oracle will correctly classify 73 weight vectors and wrongly classify 0
  Classified 0 matches and 73 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  73
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 73 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

43.0
Analysing file: diverg(15)396_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (15, 1 - acm diverg, 396), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)396_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 532
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 532 weight vectors
  Containing 218 true matches and 314 true non-matches
    (40.98% true matches)
  Identified 494 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   476  (96.36%)
          2 :    15  (3.04%)
          3 :     2  (0.40%)
         20 :     1  (0.20%)

Identified 1 non-pure unique weight vector (from 494 unique weight vectors)
Pureness (as a proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 182
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 311

Removed 1 non-pure weight vector

Final number of weight vectors to use: 531
  Number of unique weight vectors: 494

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (494, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 494 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 80

Perform initial selection using the "far" method

Farthest first selection of 80 weight vectors from 494 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 0.000, 0.857, 0.714, 0.563, 0.278, 0.385] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.300, 0.467, 0.500, 0.818, 0.421] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.900, 0.636, 0.600, 0.818, 0.444] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.889, 0.667, 0.941, 0.500, 0.333] (False)
    [1.000, 0.000, 0.900, 0.800, 0.267, 0.353, 0.647] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.615, 0.333, 0.455, 0.333, 0.286] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.714] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.000, 0.583, 0.615, 0.778, 0.526, 0.611] (False)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [1.000, 1.000, 0.171, 0.140, 0.105, 0.206, 1.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 0.583, 0.722, 0.889, 0.882, 0.786] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.304, 0.714, 0.625, 0.294, 0.238] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
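The farthest-first ("far") selection above can be sketched as a greedy traversal; a minimal sketch assuming Euclidean distance and a fixed first seed (the actual seeding and tie-breaking of the program may differ):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: start from a seed vector and
    repeatedly add the vector whose minimum Euclidean distance to the
    already-selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]  # seed choice is an assumption
    while len(selected) < k:
        candidates = [v for v in vectors if v not in selected]
        best = max(candidates, key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
    return selected
```

This greedy rule is why the selected vectors above are spread across the corners of the weight space rather than clustered together.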

Perform oracle with 100.00% accuracy on 80 weight vectors
  The oracle will correctly classify 80 weight vectors and wrongly classify 0
  Classified 33 matches and 47 non-matches
    Purity of oracle classification:  0.588
    Entropy of oracle classification: 0.978
    Number of true matches:      33
    Number of false matches:     0
    Number of true non-matches:  47
    Number of false non-matches: 0
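The purity and entropy figures reported for the oracle classification follow the standard definitions: purity is the majority-class fraction of the sample, and entropy is the binary Shannon entropy (in bits) of the match proportion. A minimal sketch:

```python
import math

def purity_entropy(num_matches, num_non_matches):
    """Purity is the fraction of the majority class in the sample;
    entropy is the binary Shannon entropy (in bits) of the match
    proportion, so a 50/50 split gives entropy 1.0."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy
```

For the 33 matches and 47 non-matches above this gives purity 47/80 = 0.5875 and entropy 0.978, matching the logged values.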

Deleted 80 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 414 weight vectors
  Based on 33 matches and 47 non-matches
  Classified 144 matches and 270 non-matches
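After the oracle step, the program splits the remaining cluster with an SVM trained on the oracle-labelled sample. The kernel and parameters of that SVM are not shown in this log; as a stand-in, a minimal nearest-centroid sketch of the same two-way split:

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def split_cluster(remaining, match_sample, non_match_sample):
    """Assign each remaining weight vector to the class whose sample
    centroid is closer (squared Euclidean distance), mimicking the
    match / non-match split of the cluster."""
    cm = centroid(match_sample)
    cn = centroid(non_match_sample)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    matches, non_matches = [], []
    for v in remaining:
        (matches if sqdist(v, cm) <= sqdist(v, cn) else non_matches).append(v)
    return matches, non_matches
```

The two resulting sub-clusters are then pushed back onto the queue, which is why the queue length grows to 2 in the next loop.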

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 80
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (144, 0.5875, 0.9777945702913884, 0.4125)
    (270, 0.5875, 0.9777945702913884, 0.4125)

Current size of match and non-match training data sets: 33 / 47

Selected cluster (queue ordering: random):
- Purity 0.59 and entropy 0.98
- Size 144 weight vectors
- Estimated match proportion 0.412

Sample size for this cluster: 57

Farthest first selection of 57 weight vectors from 144 vectors
  The selected farthest weight vectors are:
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 0.857, 1.000, 0.941, 0.917] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.952, 1.000, 0.813, 0.850, 0.824, 0.929, 0.800] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 57 weight vectors
  The oracle will correctly classify 57 weight vectors and wrongly classify 0
  Classified 52 matches and 5 non-matches
    Purity of oracle classification:  0.912
    Entropy of oracle classification: 0.429
    Number of true matches:      52
    Number of false matches:     0
    Number of true non-matches:  5
    Number of false non-matches: 0

Deleted 57 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing the file: diverg(20)266_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 266), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)266_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1073
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1073 weight vectors
  Containing 226 true matches and 847 true non-matches
    (21.06% true matches)
  Identified 1016 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   979  (96.36%)
          2 :    34  (3.35%)
          3 :     2  (0.20%)
         20 :     1  (0.10%)
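The occurrence distribution above can be reproduced with a `Counter` over the weight vectors (a sketch; loading and parsing of the CSV columns is omitted):

```python
from collections import Counter

def occurrence_distribution(weight_vectors):
    """Count how often each distinct weight vector occurs, then count
    how many distinct vectors share each occurrence frequency
    (occurrence -> number of distinct vectors occurring that often)."""
    vector_counts = Counter(tuple(v) for v in weight_vectors)
    return Counter(vector_counts.values())
```

For the file above this would map occurrence 1 to 979 distinct vectors, 2 to 34, 3 to 2, and 20 to 1.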

Identified 1 non-pure unique weight vector (from 1016 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 826

Removed 1 non-pure weight vector
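The removal of minority-class copies of a non-pure unique weight vector (here, the vector with pureness 0.950, whose single minority-class copy is dropped) can be sketched as follows; `remove_minority_copies` is a hypothetical helper, not the program's actual function:

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """Given (weight_vector, is_match) pairs, keep for each unique
    weight vector only the copies of its majority class, dropping the
    minority-class duplicates that make it non-pure."""
    groups = defaultdict(list)
    for vec, label in labelled_vectors:
        groups[tuple(vec)].append(label)
    kept = []
    for vec, labels in groups.items():
        majority = labels.count(True) >= labels.count(False)
        kept.extend((vec, majority) for _ in range(labels.count(majority)))
    return kept
```

This is why the final count drops from 1073 to 1072 weight vectors while the number of unique vectors stays at 1016.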

Final number of weight vectors to use: 1072
  Number of unique weight vectors: 1016

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1016, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 1016 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 1016 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.700, 0.333, 0.750, 0.636, 0.263] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [1.000, 0.000, 0.769, 0.217, 0.786, 0.000, 0.000] (False)
    [1.000, 0.000, 0.917, 0.786, 0.667, 0.472, 0.875] (False)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.846, 0.667, 0.500, 0.194, 0.500] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [1.000, 0.000, 0.500, 0.444, 0.294, 0.182, 0.316] (False)
    [1.000, 0.000, 0.800, 0.609, 0.857, 0.769, 0.579] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.667, 0.000, 0.650, 0.467, 0.706, 0.389, 0.737] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 0.694, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.526, 0.786, 0.304, 0.647, 0.571] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.368, 0.484, 0.708, 0.093, 0.417] (False)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 30 matches and 57 non-matches
    Purity of oracle classification:  0.655
    Entropy of oracle classification: 0.929
    Number of true matches:      30
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 929 weight vectors
  Based on 30 matches and 57 non-matches
  Classified 158 matches and 771 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (158, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)
    (771, 0.6551724137931034, 0.9293636260137187, 0.3448275862068966)

Current size of match and non-match training data sets: 30 / 57

Selected cluster (queue ordering: random):
- Purity 0.66 and entropy 0.93
- Size 158 weight vectors
- Estimated match proportion 0.345

Sample size for this cluster: 56

Farthest first selection of 56 weight vectors from 158 vectors
  The selected farthest weight vectors are:
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.867, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [0.889, 1.000, 0.938, 0.905, 0.696, 0.897, 0.941] (True)
    [1.000, 1.000, 1.000, 0.923, 0.842, 0.824, 1.000] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.880, 1.000, 1.000, 0.929, 1.000, 0.889, 1.000] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 1.000, 0.875, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 0.944, 0.842, 0.917, 0.813, 0.871, 0.833] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 56 weight vectors
  The oracle will correctly classify 56 weight vectors and wrongly classify 0
  Classified 49 matches and 7 non-matches
    Purity of oracle classification:  0.875
    Entropy of oracle classification: 0.544
    Number of true matches:      49
    Number of false matches:     0
    Number of true non-matches:  7
    Number of false non-matches: 0

Deleted 56 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing the file: diverg(10)88_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.980198
recall                 0.331104
f-measure                 0.495
da                          101
dm                            0
ndm                           0
tp                           99
fp                            2
tn                  4.76529e+07
fn                          200
Name: (10, 1 - acm diverg, 88), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)88_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 826
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 826 weight vectors
  Containing 155 true matches and 671 true non-matches
    (18.77% true matches)
  Identified 792 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   762  (96.21%)
          2 :    27  (3.41%)
          3 :     2  (0.25%)
          4 :     1  (0.13%)

Identified 0 non-pure unique weight vectors (from 792 unique weight vectors)
Pureness (as proportion of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 141
     0.000 : 651

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 826
  Number of unique weight vectors: 792

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (792, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 792 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 792 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.956, 1.000, 0.700, 0.667, 0.765, 0.667, 0.750] (True)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.900, 1.000, 0.442, 0.239, 0.233, 0.171, 0.122] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.889, 1.000, 0.232, 0.205, 0.211, 0.205, 0.833] (False)
    [1.000, 1.000, 0.944, 0.083, 0.167, 0.063, 0.150] (True)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.000, 0.875, 0.267, 0.294, 0.296, 0.250] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 31 matches and 54 non-matches
    Purity of oracle classification:  0.635
    Entropy of oracle classification: 0.947
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  54
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 707 weight vectors
  Based on 31 matches and 54 non-matches
  Classified 131 matches and 576 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (131, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)
    (576, 0.6352941176470588, 0.9465202215633438, 0.36470588235294116)

Current size of match and non-match training data sets: 31 / 54

Selected cluster (queue ordering: random):
- Purity 0.64 and entropy 0.95
- Size 576 weight vectors
- Estimated match proportion 0.365

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 576 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.632, 0.714, 0.250, 0.130, 0.167] (False)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.867, 0.333, 0.833, 0.143, 0.308] (False)
    [0.800, 0.000, 0.250, 0.429, 0.467, 0.533, 0.444] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [1.000, 0.000, 0.727, 0.571, 0.750, 0.167, 0.813] (False)
    [0.833, 0.000, 0.500, 0.500, 0.444, 0.059, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.667, 0.273, 0.583, 0.444, 0.727] (False)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.000, 0.600, 0.647, 0.667, 0.238] (False)
    [0.667, 0.000, 0.600, 0.737, 0.833, 0.700, 0.467] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.818, 0.917, 0.294, 0.667, 0.556] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.833, 0.857, 0.316, 0.333, 0.300] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 1.000, 0.389, 0.778, 0.706, 0.750] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [1.000, 0.000, 0.731, 0.792, 0.609, 0.867, 0.636] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [1.000, 0.000, 0.667, 0.467, 0.235, 0.083, 0.467] (False)
    [1.000, 0.000, 0.857, 0.417, 0.750, 0.500, 0.455] (False)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.538, 0.333, 0.611, 0.818, 0.654] (False)
    [0.800, 0.000, 0.375, 0.571, 0.333, 0.267, 0.333] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [1.000, 0.000, 0.458, 0.909, 0.250, 0.875, 0.563] (False)
    [1.000, 0.000, 0.692, 0.292, 0.500, 0.818, 0.308] (False)
    [1.000, 0.000, 0.300, 0.357, 0.818, 0.000, 0.000] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [1.000, 0.000, 0.917, 0.786, 0.368, 0.250, 0.833] (False)
    [1.000, 0.000, 0.625, 0.182, 0.417, 0.185, 0.214] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.696, 0.357, 0.909, 0.000, 0.000] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.367, 0.545, 0.333, 0.688, 0.286] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.346, 0.522, 0.765, 0.769, 0.455] (False)
    [1.000, 0.000, 0.700, 0.833, 0.524, 0.636, 0.238] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
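Farthest-first (greedy k-center) selection, as used above, repeatedly adds the vector whose minimum distance to the already-selected set is largest, so the sample spreads across the cluster rather than clumping. A minimal sketch; the seed choice (first vector) and Euclidean metric are assumptions about this program:

```python
import math

def farthest_first(vectors, k):
    """Greedy k-center selection: repeatedly add the vector farthest
    (in minimum Euclidean distance) from everything chosen so far.
    Seeding from vectors[0] is an assumption; the real program may
    start from a random vector or a cluster corner."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [vectors[0]]
    # cached minimum distance from each candidate to the selected set
    min_d = [dist(v, vectors[0]) for v in vectors]
    while len(selected) < k:
        i = max(range(len(vectors)), key=lambda j: min_d[j])
        selected.append(vectors[i])
        for j, v in enumerate(vectors):
            min_d[j] = min(min_d[j], dist(v, vectors[i]))
    return selected
```

Updating the cached minimum distances after each pick keeps the cost at O(k·n) distance evaluations instead of O(k²·n).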

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

101.0
Analysing file: diverg(20)410_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.130435
f-measure              0.230769
da                           39
dm                            0
ndm                           0
tp                           39
fp                            0
tn                  4.76529e+07
fn                          260
Name: (20, 1 - acm diverg, 410), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)410_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 808
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 808 weight vectors
  Containing 226 true matches and 582 true non-matches
    (27.97% true matches)
  Identified 769 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   750  (97.53%)
          2 :    16  (2.08%)
          3 :     2  (0.26%)
         20 :     1  (0.13%)
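The occurrence distribution above can be reproduced by counting duplicates of each weight vector and then counting the counts; a short sketch using Python's `collections.Counter` (vectors converted to tuples so they are hashable):

```python
from collections import Counter

def occurrence_distribution(vectors):
    """Map 'occurrence count' -> 'number of unique vectors occurring
    that often', as printed in the log above."""
    per_vector = Counter(tuple(v) for v in vectors)  # vector -> count
    return Counter(per_vector.values())              # count -> frequency
```

For the 808-vector file above this yields {1: 750, 2: 16, 3: 2, 20: 1}, i.e. 769 unique vectors in total.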

Identified 1 non-pure unique weight vector (from 769 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 189
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 579

Removed 1 non-pure weight vector
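A unique weight vector is "non-pure" when the record pairs that generated it disagree on the true match status; the step above drops the minority-class copies (e.g. the pureness-0.95 vector with 19 match copies and 1 non-match copy loses the single non-match). A sketch of that filtering; the function name and the tie-handling rule are our assumptions:

```python
from collections import defaultdict

def remove_minority_copies(labelled_vectors):
    """labelled_vectors: iterable of (vector_tuple, is_match) pairs.
    For each unique vector whose copies disagree on the true match
    status, drop the minority-class copies, as in the 'minority class
    weight vectors ... to be removed' step. Treating a tie as a match
    is an assumption."""
    groups = defaultdict(list)
    for vec, is_match in labelled_vectors:
        groups[vec].append(is_match)
    kept = []
    for vec, labels in groups.items():
        majority = sum(labels) * 2 >= len(labels)  # majority class label
        kept.extend((vec, lab) for lab in labels if lab == majority)
    return kept
```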

Final number of weight vectors to use: 807
  Number of unique weight vectors: 769

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (769, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 769 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 769 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.907, 1.000, 0.619, 0.118, 0.091, 0.063, 0.188] (True)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 28 matches and 57 non-matches
    Purity of oracle classification:  0.671
    Entropy of oracle classification: 0.914
    Number of true matches:      28
    Number of false matches:     0
    Number of true non-matches:  57
    Number of false non-matches: 0
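The purity and entropy figures reported by the oracle follow directly from the proportion of matches in the classified sample: purity is the majority-class fraction, entropy is the binary Shannon entropy in bits. A sketch reproducing the values above (28 matches out of 85 gives purity 0.671 and entropy 0.914):

```python
import math

def purity(match_prop):
    """Majority-class fraction of a two-class cluster."""
    return max(match_prop, 1.0 - match_prop)

def entropy(match_prop):
    """Binary Shannon entropy in bits; 0 for a pure cluster."""
    p = match_prop
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
```

A pure cluster (all matches or all non-matches, as in the Loop 2 oracle result earlier) has purity 1.0 and entropy 0.0.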

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 684 weight vectors
  Based on 28 matches and 57 non-matches
  Classified 141 matches and 543 non-matches
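After the oracle labels a sample, the remaining vectors in the cluster are classified with an SVM trained on those labels, and the cluster is split into predicted-match and predicted-non-match children (here 141 and 543). The run presumably uses a real SVM library; as a dependency-free sketch of the split step only, a nearest-centroid rule stands in for the SVM decision:

```python
def split_cluster(train_matches, train_nonmatches, remaining):
    """Split `remaining` vectors into (matches, non_matches) with a
    nearest-centroid rule trained on the oracle-labelled sample.
    This is a stand-in for the SVM classifier used in the run above,
    not the program's actual classifier."""
    def centroid(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    c_match = centroid(train_matches)
    c_non = centroid(train_nonmatches)
    matches, non_matches = [], []
    for v in remaining:
        (matches if sq_dist(v, c_match) < sq_dist(v, c_non)
         else non_matches).append(v)
    return matches, non_matches
```

The two resulting child clusters are then pushed back onto the queue, which is why the next loop starts with a queue length of 2.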

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)
    (543, 0.6705882352941176, 0.914324246431782, 0.32941176470588235)

Current size of match and non-match training data sets: 28 / 57

Selected cluster (queue ordering: random) with:
- Purity 0.67 and entropy 0.91
- Size 141 weight vectors
- Estimated match proportion 0.329

Sample size for this cluster: 53

Farthest first selection of 53 weight vectors from 141 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.879, 1.000, 0.750, 0.750, 0.735, 0.733, 0.722] (True)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [0.938, 1.000, 1.000, 0.905, 1.000, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [1.000, 1.000, 0.900, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [0.967, 1.000, 0.800, 0.929, 0.917, 0.714, 0.867] (True)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.833, 1.000, 0.917, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.871, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [0.981, 1.000, 0.938, 0.952, 0.720, 0.677, 0.941] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 53 weight vectors
  The oracle will correctly classify 53 weight vectors and wrongly classify 0
  Classified 50 matches and 3 non-matches
    Purity of oracle classification:  0.943
    Entropy of oracle classification: 0.314
    Number of true matches:      50
    Number of false matches:     0
    Number of true non-matches:  3
    Number of false non-matches: 0

Deleted 53 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

39.0
Analysing file: diverg(10)799_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                0.9875
recall                 0.264214
f-measure              0.416887
da                           80
dm                            0
ndm                           0
tp                           79
fp                            1
tn                  4.76529e+07
fn                          220
Name: (10, 1 - acm diverg, 799), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)799_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1016
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1016 weight vectors
  Containing 186 true matches and 830 true non-matches
    (18.31% true matches)
  Identified 974 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   943  (96.82%)
          2 :    28  (2.87%)
          3 :     2  (0.21%)
         11 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 974 unique weight vectors)
Pureness (as fraction of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 164
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 809

Removed 1 non-pure weight vector

Final number of weight vectors to use: 1015
  Number of unique weight vectors: 974

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (974, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 974 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 87

Perform initial selection using "far" method

Farthest first selection of 87 weight vectors from 974 vectors
  The selected farthest weight vectors are:
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.781, 1.000, 0.407, 0.415, 0.258, 0.222, 0.219] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.667, 0.000, 0.850, 0.750, 0.522, 0.667, 0.300] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [1.000, 1.000, 0.000, 0.000, 0.143, 0.100, 0.000] (True)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.751, 1.000, 0.842, 0.000, 0.000, 0.000, 0.000] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)

Perform oracle with 100.00% accuracy on 87 weight vectors
  The oracle will correctly classify 87 weight vectors and wrongly classify 0
  Classified 31 matches and 56 non-matches
    Purity of oracle classification:  0.644
    Entropy of oracle classification: 0.940
    Number of true matches:      31
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 87 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 887 weight vectors
  Based on 31 matches and 56 non-matches
  Classified 290 matches and 597 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 87
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (290, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)
    (597, 0.6436781609195402, 0.9395876193289701, 0.3563218390804598)

Current size of match and non-match training data sets: 31 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.64 and entropy 0.94
- Size 597 weight vectors
- Estimated match proportion 0.356

Sample size for this cluster: 77

Farthest first selection of 77 weight vectors from 597 vectors
  The selected farthest weight vectors are:
    [1.000, 0.000, 0.067, 0.300, 0.579, 0.889, 0.571] (False)
    [1.000, 0.000, 0.296, 0.667, 0.421, 0.450, 0.692] (False)
    [1.000, 0.000, 0.750, 0.857, 0.235, 0.636, 0.550] (False)
    [1.000, 0.000, 0.875, 0.545, 0.789, 0.556, 0.385] (False)
    [1.000, 0.000, 0.417, 0.696, 0.824, 0.455, 0.842] (False)
    [1.000, 0.000, 0.400, 0.429, 0.810, 0.364, 0.286] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.500, 0.714, 0.235, 0.857, 0.571] (False)
    [1.000, 0.000, 0.633, 0.833, 0.524, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [1.000, 0.000, 0.458, 0.714, 0.600, 0.194, 0.938] (False)
    [0.800, 0.000, 0.375, 0.143, 0.267, 0.467, 0.333] (False)
    [1.000, 0.000, 0.750, 0.929, 0.789, 0.211, 0.545] (False)
    [1.000, 0.000, 0.043, 0.650, 0.818, 0.471, 0.905] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.714, 0.087, 0.267, 0.571, 0.353] (False)
    [0.667, 0.000, 0.760, 0.455, 0.545, 0.500, 0.227] (False)
    [1.000, 0.000, 0.815, 0.391, 0.571, 0.650, 0.818] (False)
    [1.000, 0.000, 0.739, 0.857, 0.909, 0.765, 0.524] (False)
    [0.667, 0.000, 0.720, 0.455, 0.500, 0.000, 0.000] (False)
    [1.000, 0.000, 0.333, 0.455, 0.688, 0.714, 1.000] (False)
    [1.000, 0.000, 0.467, 0.700, 0.611, 0.000, 0.444] (False)
    [1.000, 0.000, 0.417, 0.348, 0.733, 0.917, 0.706] (False)
    [1.000, 0.000, 0.308, 0.609, 0.471, 0.846, 0.714] (False)
    [1.000, 0.000, 0.462, 0.409, 0.833, 0.263, 0.688] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [1.000, 0.000, 0.526, 0.536, 0.292, 0.241, 0.208] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 0.000, 0.300, 0.500, 0.810, 0.750, 0.238] (False)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)
    [1.000, 0.000, 0.480, 0.786, 0.773, 0.286, 0.273] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 0.000, 0.417, 0.909, 0.500, 0.636, 0.889] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [1.000, 0.000, 0.875, 0.222, 0.556, 0.296, 0.286] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 0.000, 0.000, 0.550, 0.688, 0.625, 0.238] (False)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [0.667, 0.000, 0.800, 0.633, 0.647, 0.500, 0.600] (False)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 0.000, 0.583, 0.500, 0.778, 0.647, 0.643] (False)
    [1.000, 0.000, 0.407, 0.818, 0.875, 0.800, 0.667] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 0.000, 0.714, 0.091, 0.333, 0.286, 0.545] (False)
    [1.000, 0.000, 0.900, 0.833, 0.588, 0.750, 0.278] (False)
    [0.667, 0.000, 0.792, 0.333, 0.700, 0.389, 0.737] (False)
    [0.667, 0.000, 0.360, 0.833, 0.773, 0.786, 0.227] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.800, 0.000, 0.462, 0.636, 0.364, 0.053, 0.625] (False)
    [1.000, 0.000, 0.667, 0.538, 0.455, 0.581, 0.385] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [1.000, 0.000, 0.792, 0.500, 0.550, 0.000, 0.000] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [0.667, 0.000, 0.350, 0.613, 0.632, 0.500, 0.633] (False)
    [0.667, 0.000, 0.600, 0.467, 0.471, 0.722, 0.737] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 0.000, 0.667, 0.476, 0.200, 0.500, 0.688] (False)
    [1.000, 0.000, 0.038, 0.783, 0.786, 0.615, 0.524] (False)
    [1.000, 0.000, 0.833, 0.261, 0.667, 0.846, 0.524] (False)
    [0.833, 0.000, 0.556, 0.304, 0.267, 0.091, 0.588] (False)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 0.000, 0.818, 0.667, 0.458, 0.333, 0.229] (False)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [1.000, 0.000, 0.850, 0.417, 1.000, 0.000, 0.000] (False)
    [1.000, 0.000, 0.381, 0.833, 0.579, 0.778, 0.385] (False)
    [1.000, 0.000, 0.846, 0.643, 0.818, 0.789, 0.875] (False)
    [1.000, 0.000, 0.500, 0.179, 0.636, 0.059, 0.000] (False)
    [0.964, 1.000, 0.140, 0.100, 0.188, 0.000, 0.000] (False)
    [0.667, 0.000, 0.440, 0.909, 0.227, 0.571, 0.409] (False)

Perform oracle with 100.00% accuracy on 77 weight vectors
  The oracle will correctly classify 77 weight vectors and wrongly classify 0
  Classified 0 matches and 77 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      0
    Number of false matches:     0
    Number of true non-matches:  77
    Number of false non-matches: 0

*** Warning: Oracle returns an empty match dictionary ***
Deleted 77 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

80.0
Analysing file: diverg(20)940_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young one!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.978261
recall                 0.150502
f-measure               0.26087
da                           46
dm                            0
ndm                           0
tp                           45
fp                            1
tn                  4.76529e+07
fn                          254
Name: (20, 1 - acm diverg, 940), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(20)940_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 1094
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 1094 weight vectors
  Containing 221 true matches and 873 true non-matches
    (20.20% true matches)
  Identified 1038 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :  1002  (96.53%)
          2 :    33  (3.18%)
          3 :     2  (0.19%)
         20 :     1  (0.10%)

Identified 1 non-pure unique weight vector (from 1038 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 185
     0.950 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 852

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 1093
  Number of unique weight vectors: 1038

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (1038, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 1038 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 88

Perform initial selection using "far" method

Farthest first selection of 88 weight vectors from 1038 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [1.000, 0.000, 0.048, 0.250, 0.917, 0.875, 0.238] (False)
    [1.000, 0.000, 0.286, 0.357, 0.833, 0.389, 0.385] (False)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.917, 0.000, 0.900, 0.731, 0.273, 0.355, 0.235] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [0.667, 0.000, 0.400, 1.000, 0.455, 0.786, 0.727] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.000, 0.308, 0.261, 0.684, 0.692, 0.846] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.400, 0.833, 0.579, 0.000, 0.000] (False)
    [0.800, 0.000, 0.667, 0.273, 0.500, 0.037, 0.143] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [1.000, 0.000, 0.600, 0.700, 0.545, 0.000, 1.000] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.211, 0.190, 0.471, 0.267, 0.615] (False)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [1.000, 0.000, 0.455, 0.571, 0.500, 0.600, 0.400] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.500, 0.111, 0.235, 0.409, 0.316] (False)
    [1.000, 0.000, 0.808, 0.217, 0.714, 0.538, 0.455] (False)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.050, 0.250, 0.818, 0.000, 0.000] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.625, 0.000, 0.375, 0.500, 0.267, 0.067, 0.500] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.769, 0.522, 0.786, 0.929, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [0.667, 0.000, 0.800, 0.786, 0.455, 0.706, 0.909] (False)
    [1.000, 0.000, 0.067, 0.550, 0.455, 1.000, 0.429] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.500, 0.714, 0.000, 0.000, 0.600] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 0.000, 0.259, 0.818, 0.500, 0.250, 0.556] (False)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 0.000, 0.833, 0.500, 0.550, 1.000, 0.313] (False)
    [1.000, 0.000, 1.000, 0.727, 0.889, 0.500, 0.278] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.033, 0.650, 0.833, 0.727, 0.762] (False)
    [0.667, 0.000, 0.440, 0.818, 0.318, 0.750, 0.273] (False)
    [1.000, 0.000, 0.292, 0.909, 0.750, 0.714, 0.313] (False)
    [1.000, 0.000, 0.000, 0.700, 0.917, 0.222, 0.762] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.667, 0.000, 0.040, 0.550, 0.500, 0.571, 0.909] (False)
    [1.000, 0.000, 0.600, 0.600, 0.688, 0.000, 0.533] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [1.000, 0.000, 0.750, 0.375, 1.000, 0.148, 0.214] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
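
The "far" initial selection shown above can be sketched as a greedy farthest-first traversal under Euclidean distance; seeding from the first vector is an assumption (the actual script may seed differently):

```python
import math

def farthest_first(vectors, k):
    """Greedy farthest-first traversal: starting from the first vector,
    repeatedly add the candidate whose minimum Euclidean distance to the
    already-selected set is largest."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    selected = [vectors[0]]
    remaining = list(vectors[1:])
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda v: min(dist(v, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because each new pick maximises its distance to everything already chosen, the sample spreads across the whole weight vector space, which is why the list above mixes clear matches, clear non-matches, and borderline vectors.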

Perform oracle with 100.00% accuracy on 88 weight vectors
  The oracle will correctly classify 88 weight vectors and wrongly classify 0
  Classified 23 matches and 65 non-matches
    Purity of oracle classification:  0.739
    Entropy of oracle classification: 0.829
    Number of true matches:      23
    Number of false matches:     0
    Number of true non-matches:  65
    Number of false non-matches: 0
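
The purity and entropy reported for each oracle sample follow directly from the match/non-match counts. A minimal sketch, assuming purity is the majority-class fraction and entropy the binary Shannon entropy (log base 2), which reproduces the 0.739 / 0.829 figures above:

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity = fraction of the majority class; entropy = binary
    Shannon entropy of the match/non-match split (log base 2)."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    entropy = -sum(q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)
    return purity, entropy

purity, entropy = purity_and_entropy(23, 65)  # the sample above
# purity ≈ 0.739, entropy ≈ 0.829, matching the log output
```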

Deleted 88 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 950 weight vectors
  Based on 23 matches and 65 non-matches
  Classified 103 matches and 847 non-matches
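
The split step trains a binary classifier on the oracle-labelled sample and partitions the rest of the cluster by predicted class (here 103 predicted matches and 847 predicted non-matches). A hypothetical sketch using scikit-learn's SVC; the original script's SVM implementation, kernel, and parameters are not visible in this log:

```python
from sklearn.svm import SVC

def split_cluster_by_svm(train_vecs, train_labels, cluster_vecs):
    """Train a binary SVM on the oracle-classified weight vectors and
    split the remaining cluster into predicted-match and
    predicted-non-match sub-clusters."""
    clf = SVC(kernel="linear")
    clf.fit(train_vecs, train_labels)
    preds = clf.predict(cluster_vecs)
    matches = [v for v, p in zip(cluster_vecs, preds) if p == 1]
    non_matches = [v for v, p in zip(cluster_vecs, preds) if p == 0]
    return matches, non_matches
```

Both child clusters then re-enter the queue, initially carrying the parent sample's purity, entropy, and estimated match proportion, as Loop 2 shows.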

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 88
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (103, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)
    (847, 0.7386363636363636, 0.828797154590015, 0.26136363636363635)

Current size of match and non-match training data sets: 23 / 65

Selected cluster (queue ordering: random) with:
- Purity 0.74 and entropy 0.83
- Size 103 weight vectors
- Estimated match proportion 0.261

Sample size for this cluster: 43

Farthest first selection of 43 weight vectors from 103 vectors
  The selected farthest weight vectors are:
    [0.833, 1.000, 0.913, 1.000, 1.000, 0.957, 0.875] (True)
    [0.857, 1.000, 0.930, 0.912, 1.000, 0.936, 1.000] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [0.993, 1.000, 0.905, 0.875, 1.000, 0.833, 0.909] (True)
    [0.644, 1.000, 1.000, 1.000, 0.933, 1.000, 0.900] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [1.000, 1.000, 0.870, 0.875, 0.867, 0.889, 0.900] (True)
    [0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [0.950, 0.778, 0.938, 0.947, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.922, 1.000, 1.000, 1.000, 1.000, 0.933, 0.710] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [1.000, 1.000, 0.970, 0.750, 1.000, 0.905, 1.000] (True)
    [1.000, 1.000, 1.000, 0.875, 1.000, 1.000, 0.833] (True)
    [0.956, 0.694, 1.000, 1.000, 0.969, 1.000, 0.950] (True)
    [0.789, 1.000, 0.920, 0.867, 1.000, 0.909, 0.864] (True)
    [1.000, 1.000, 1.000, 0.842, 0.786, 1.000, 1.000] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.933, 1.000, 0.943, 1.000, 0.917, 0.952, 0.913] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.933, 1.000, 0.800, 0.964, 0.933, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 0.929, 0.917, 0.857, 0.933] (True)
    [1.000, 1.000, 1.000, 1.000, 0.889, 1.000, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.882] (True)
    [1.000, 1.000, 0.929, 0.824, 0.955, 1.000, 0.938] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.644, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.933, 1.000, 0.914, 0.750, 0.917, 0.857, 0.913] (True)
    [1.000, 1.000, 1.000, 1.000, 0.941, 1.000, 0.800] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [1.000, 1.000, 0.929, 0.800, 0.857, 0.857, 0.846] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.824] (True)
    [1.000, 1.000, 0.889, 0.933, 1.000, 1.000, 0.917] (True)
    [0.956, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)

Perform oracle with 100.00% accuracy on 43 weight vectors
  The oracle will correctly classify 43 weight vectors and wrongly classify 0
  Classified 43 matches and 0 non-matches
    Purity of oracle classification:  1.000
    Entropy of oracle classification: 0.000
    Number of true matches:      43
    Number of false matches:     0
    Number of true non-matches:  0
    Number of false non-matches: 0

*** Warning: Oracle returns an empty non-match dictionary ***
Deleted 43 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

46.0
Analysing file: diverg(15)373_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                 0.190635
f-measure              0.320225
da                           57
dm                            0
ndm                           0
tp                           57
fp                            0
tn                  4.76529e+07
fn                          242
Name: (15, 1 - acm diverg, 373), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)373_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 781
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 781 weight vectors
  Containing 206 true matches and 575 true non-matches
    (26.38% true matches)
  Identified 752 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   735  (97.74%)
          2 :    14  (1.86%)
          3 :     2  (0.27%)
         12 :     1  (0.13%)

Identified 1 non-pure unique weight vector (from 752 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 179
     0.917 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 572

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 780
  Number of unique weight vectors: 752

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (752, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 752 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 85

Perform initial selection using "far" method

Farthest first selection of 85 weight vectors from 752 vectors
  The selected farthest weight vectors are:
    [0.733, 0.000, 0.176, 0.304, 0.135, 0.174, 0.125] (False)
    [0.911, 1.000, 1.000, 1.000, 0.929, 1.000, 0.600] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [1.000, 0.000, 0.789, 0.833, 0.174, 0.867, 0.714] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [0.667, 0.000, 0.600, 0.857, 0.500, 0.700, 0.267] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [0.500, 0.000, 0.417, 0.500, 0.300, 0.636, 0.765] (False)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.417, 0.357, 0.350, 0.412, 0.625] (False)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 0.000, 0.292, 0.323, 0.800, 0.714, 0.714] (False)
    [0.667, 0.000, 0.826, 0.467, 0.588, 0.722, 0.810] (False)
    [0.687, 1.000, 0.467, 0.500, 0.371, 0.552, 0.517] (False)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 0.000, 0.542, 0.476, 0.300, 1.000, 0.500] (False)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.563, 0.731, 0.182, 0.452, 0.500] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [0.500, 0.000, 0.684, 0.737, 0.250, 0.167, 0.417] (False)
    [1.000, 1.000, 0.188, 0.140, 0.132, 0.162, 1.000] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.929, 0.900, 0.889, 0.929] (False)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)
    [1.000, 0.000, 0.200, 0.800, 0.750, 0.611, 0.684] (False)
    [0.533, 0.000, 0.667, 0.800, 0.857, 0.727, 0.652] (False)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 0.000, 0.421, 0.800, 0.375, 0.222, 0.229] (False)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.875, 0.375, 0.625, 0.259, 0.214] (False)
    [0.876, 0.756, 0.935, 1.000, 0.875, 0.882, 0.267] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [1.000, 1.000, 0.250, 0.167, 0.135, 0.143, 0.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.600, 0.944, 0.382, 0.023, 0.303, 0.397, 0.147] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.778, 0.727, 0.875, 0.833, 0.333] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 0.000, 0.300, 0.733, 0.706, 0.833, 0.263] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [0.966, 1.000, 0.727, 0.125, 0.727, 0.200, 0.217] (True)
    [0.533, 0.000, 0.556, 0.474, 0.750, 0.450, 0.391] (False)
    [1.000, 0.000, 0.786, 0.857, 1.000, 0.194, 0.813] (False)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [1.000, 0.000, 0.810, 1.000, 0.750, 0.000, 0.000] (False)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [0.667, 0.000, 0.750, 0.667, 0.235, 0.722, 0.526] (False)
    [1.000, 1.000, 1.000, 0.138, 0.167, 0.143, 0.048] (False)
    [0.667, 0.000, 0.350, 0.677, 0.737, 0.278, 0.810] (False)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 0.000, 0.474, 0.577, 0.708, 0.519, 0.104] (False)
    [0.833, 0.000, 0.542, 0.714, 0.600, 1.000, 0.813] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [1.000, 0.778, 0.643, 0.667, 0.792, 0.833, 0.706] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.750, 0.000, 0.792, 0.667, 0.792, 0.130, 0.333] (False)

Perform oracle with 100.00% accuracy on 85 weight vectors
  The oracle will correctly classify 85 weight vectors and wrongly classify 0
  Classified 29 matches and 56 non-matches
    Purity of oracle classification:  0.659
    Entropy of oracle classification: 0.926
    Number of true matches:      29
    Number of false matches:     0
    Number of true non-matches:  56
    Number of false non-matches: 0

Deleted 85 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 667 weight vectors
  Based on 29 matches and 56 non-matches
  Classified 141 matches and 526 non-matches

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 85
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (141, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)
    (526, 0.6588235294117647, 0.9259400597385791, 0.3411764705882353)

Current size of match and non-match training data sets: 29 / 56

Selected cluster (queue ordering: random) with:
- Purity 0.66 and entropy 0.93
- Size 526 weight vectors
- Estimated match proportion 0.341

Sample size for this cluster: 74

Farthest first selection of 74 weight vectors from 526 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.786, 0.619, 0.500, 1.000, 0.500] (False)
    [1.000, 0.000, 0.385, 0.714, 0.500, 0.647, 0.643] (False)
    [1.000, 0.000, 0.684, 0.792, 0.261, 0.467, 0.636] (False)
    [1.000, 0.000, 0.778, 0.667, 0.833, 0.833, 0.278] (False)
    [0.433, 1.000, 0.161, 0.172, 0.107, 0.185, 0.000] (False)
    [1.000, 1.000, 0.129, 0.053, 0.050, 0.533, 0.344] (False)
    [1.000, 0.000, 0.867, 1.000, 0.522, 0.680, 0.400] (False)
    [1.000, 0.000, 0.233, 0.533, 0.611, 0.909, 0.737] (False)
    [1.000, 0.000, 0.391, 0.786, 0.588, 0.706, 0.238] (False)
    [0.667, 0.000, 0.750, 0.810, 0.333, 0.714, 0.400] (False)
    [0.635, 1.000, 1.000, 0.176, 0.214, 0.120, 0.143] (False)
    [1.000, 0.000, 0.750, 0.538, 0.409, 0.548, 0.357] (False)
    [0.667, 0.000, 0.760, 0.909, 0.818, 0.500, 0.727] (False)
    [1.000, 0.000, 0.375, 0.385, 0.773, 0.226, 0.313] (False)
    [0.950, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [0.833, 0.000, 0.632, 0.867, 0.500, 0.130, 0.292] (False)
    [1.000, 0.000, 0.269, 0.677, 0.684, 0.385, 0.524] (False)
    [1.000, 0.000, 0.429, 0.417, 0.647, 0.583, 1.000] (False)
    [1.000, 0.000, 0.333, 0.346, 0.364, 0.613, 0.364] (False)
    [0.698, 1.000, 0.431, 0.345, 0.333, 0.323, 0.039] (False)
    [1.000, 0.000, 0.370, 0.321, 0.600, 0.650, 0.643] (False)
    [1.000, 0.000, 0.857, 0.452, 0.526, 0.278, 0.619] (False)
    [0.667, 0.000, 0.550, 0.467, 0.706, 0.444, 0.789] (False)
    [1.000, 0.000, 0.750, 0.417, 0.783, 0.467, 0.563] (False)
    [0.667, 0.000, 0.650, 0.667, 0.353, 0.389, 0.421] (False)
    [0.436, 0.000, 0.700, 0.533, 0.353, 0.444, 0.783] (False)
    [1.000, 0.000, 0.304, 0.452, 0.526, 0.294, 0.810] (False)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [0.667, 0.000, 0.450, 0.692, 0.545, 0.323, 0.167] (False)
    [0.733, 0.000, 0.176, 0.261, 0.216, 0.261, 0.125] (False)
    [0.400, 0.000, 0.750, 0.737, 0.500, 0.800, 0.633] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 0.000, 0.900, 0.789, 0.458, 0.185, 0.521] (False)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [1.000, 0.000, 1.000, 0.583, 0.353, 0.750, 0.625] (False)
    [1.000, 0.000, 0.625, 0.857, 0.667, 0.786, 0.529] (False)
    [0.667, 0.000, 0.850, 0.500, 0.708, 0.333, 0.396] (False)
    [1.000, 0.000, 0.433, 0.867, 0.833, 0.636, 0.737] (False)
    [0.704, 0.000, 0.867, 0.789, 0.353, 0.409, 0.739] (False)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.958, 0.000, 0.750, 0.800, 0.750, 0.000, 0.000] (False)
    [1.000, 0.000, 0.316, 0.867, 0.417, 0.333, 0.271] (False)
    [0.614, 1.000, 0.208, 0.170, 0.216, 0.273, 0.333] (False)
    [1.000, 1.000, 0.229, 0.227, 0.125, 0.122, 0.160] (False)
    [1.000, 0.000, 0.550, 0.737, 0.833, 0.278, 0.533] (False)
    [1.000, 0.000, 0.737, 0.714, 0.167, 0.259, 0.250] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [0.917, 0.000, 0.818, 0.636, 0.833, 0.889, 0.889] (False)
    [1.000, 0.000, 0.778, 0.833, 0.882, 0.417, 1.000] (False)
    [0.667, 0.000, 0.300, 0.467, 0.529, 0.722, 0.684] (False)
    [0.667, 0.000, 0.769, 0.739, 0.786, 0.692, 0.367] (False)
    [1.000, 0.000, 0.818, 0.786, 0.706, 0.333, 0.313] (False)
    [0.720, 1.000, 0.333, 0.333, 0.333, 0.667, 0.667] (False)
    [1.000, 0.000, 0.259, 0.290, 0.421, 0.250, 0.429] (False)
    [1.000, 0.000, 0.500, 0.714, 0.450, 0.412, 0.875] (False)
    [1.000, 0.000, 0.526, 0.792, 0.261, 0.733, 0.471] (False)
    [1.000, 0.000, 0.643, 0.538, 0.545, 0.226, 0.286] (False)
    [0.733, 0.000, 0.500, 0.800, 0.500, 0.909, 0.533] (False)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [0.504, 1.000, 0.167, 0.095, 0.143, 0.135, 1.000] (False)
    [0.667, 0.000, 0.650, 0.600, 0.706, 0.727, 0.850] (False)
    [1.000, 0.000, 0.538, 0.613, 0.789, 0.227, 0.857] (False)
    [0.737, 1.000, 0.211, 0.071, 0.233, 0.111, 0.000] (False)
    [0.667, 0.000, 0.704, 0.300, 0.471, 0.750, 0.474] (False)
    [1.000, 0.000, 0.857, 0.636, 0.818, 0.174, 0.556] (False)
    [1.000, 1.000, 0.194, 0.167, 0.229, 0.222, 0.750] (False)
    [1.000, 0.000, 0.769, 0.905, 1.000, 0.636, 0.412] (False)
    [1.000, 0.000, 0.704, 0.375, 0.348, 0.750, 0.727] (False)
    [0.592, 1.000, 0.229, 0.261, 0.200, 0.857, 0.972] (False)
    [0.677, 0.000, 0.467, 0.613, 0.316, 0.556, 0.652] (False)
    [0.533, 0.000, 0.400, 0.684, 0.600, 0.500, 0.565] (False)

Perform oracle with 100.00% accuracy on 74 weight vectors
  The oracle will correctly classify 74 weight vectors and wrongly classify 0
  Classified 7 matches and 67 non-matches
    Purity of oracle classification:  0.905
    Entropy of oracle classification: 0.452
    Number of true matches:      7
    Number of false matches:     0
    Number of true non-matches:  67
    Number of false non-matches: 0

Deleted 74 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

57.0
Analysing file: diverg(10)51_NEW.csv
<class 'pandas.core.series.Series'>
Current line here, youngster!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision                     1
recall                  0.19398
f-measure               0.32493
da                           58
dm                            0
ndm                           0
tp                           58
fp                            0
tn                  4.76529e+07
fn                          241
Name: (10, 1 - acm diverg, 51), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(10)51_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 302
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 302 weight vectors
  Containing 196 true matches and 106 true non-matches
    (64.90% true matches)
  Identified 278 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   265  (95.32%)
          2 :    10  (3.60%)
          3 :     2  (0.72%)
         11 :     1  (0.36%)

Identified 1 non-pure unique weight vector (from 278 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 172
     0.909 :  1   (minority class weight vectors with this pureness to be removed)
     0.000 : 105

Removed 1 non-pure weight vectors

Final number of weight vectors to use: 301
  Number of unique weight vectors: 278

Time to load and analyse the weight vector file: 0.00 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (278, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster (queue ordering: random) with:
- Purity 0.50 and entropy 1.00
- Size 278 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 71

Perform initial selection using "far" method

Farthest first selection of 71 weight vectors from 278 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.625, 0.900, 0.857, 0.929, 0.909] (True)
    [1.000, 1.000, 0.439, 0.295, 0.156, 0.613, 0.583] (True)
    [1.000, 1.000, 0.889, 0.800, 0.500, 0.346, 0.833] (True)
    [1.000, 0.000, 0.727, 0.733, 0.294, 0.667, 0.643] (False)
    [0.617, 1.000, 0.778, 0.867, 0.875, 1.000, 0.176] (True)
    [1.000, 1.000, 0.350, 0.500, 0.536, 0.350, 0.333] (True)
    [1.000, 0.000, 0.400, 0.500, 0.818, 0.111, 0.615] (False)
    [0.329, 1.000, 0.143, 0.048, 0.143, 0.162, 0.000] (False)
    [1.000, 1.000, 0.333, 1.000, 0.210, 0.100, 0.214] (True)
    [1.000, 1.000, 1.000, 0.000, 1.000, 1.000, 1.000] (True)
    [0.226, 1.000, 0.667, 0.667, 0.667, 0.667, 0.667] (False)
    [0.520, 1.000, 0.923, 0.000, 0.083, 0.947, 1.000] (False)
    [1.000, 0.444, 1.000, 1.000, 0.859, 0.156, 0.207] (False)
    [1.000, 0.000, 0.833, 0.667, 0.765, 0.773, 0.579] (False)
    [0.650, 1.000, 1.000, 0.350, 0.350, 0.300, 0.300] (False)
    [1.000, 0.000, 0.583, 0.444, 0.944, 0.455, 0.789] (False)
    [0.496, 1.000, 0.800, 0.857, 0.833, 0.762, 0.800] (True)
    [0.680, 0.778, 0.125, 0.571, 0.500, 0.700, 0.667] (False)
    [1.000, 1.000, 1.000, 1.000, 0.650, 1.000, 0.950] (True)
    [0.850, 1.000, 0.733, 1.000, 0.588, 0.615, 0.632] (True)
    [1.000, 1.000, 0.556, 0.529, 1.000, 0.548, 0.316] (True)
    [0.267, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.350, 1.000, 0.643, 0.440, 0.154, 0.739, 0.474] (False)
    [0.633, 1.000, 0.414, 0.109, 0.176, 0.153, 0.000] (False)
    [1.000, 0.556, 1.000, 0.929, 1.000, 0.220, 1.000] (False)
    [1.000, 0.000, 0.348, 0.867, 0.529, 0.706, 0.524] (False)
    [1.000, 0.667, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.517, 1.000, 1.000, 0.500, 0.172, 0.036, 0.095] (False)
    [1.000, 1.000, 0.885, 0.727, 0.778, 0.750, 0.692] (True)
    [1.000, 1.000, 0.630, 0.697, 0.607, 0.615, 0.192] (True)
    [0.900, 1.000, 0.536, 0.195, 0.051, 0.206, 0.440] (False)
    [1.000, 1.000, 0.450, 0.750, 0.200, 0.389, 0.684] (False)
    [0.875, 1.000, 0.125, 0.188, 0.257, 0.171, 1.000] (False)
    [1.000, 0.000, 0.000, 0.857, 0.364, 0.571, 0.476] (False)
    [0.240, 1.000, 1.000, 0.900, 1.000, 0.947, 0.146] (True)
    [0.350, 1.000, 0.194, 0.193, 0.105, 0.176, 1.000] (False)
    [0.725, 1.000, 1.000, 0.960, 1.000, 0.373, 1.000] (True)
    [1.000, 1.000, 0.267, 0.642, 0.486, 0.474, 0.974] (True)
    [1.000, 1.000, 0.214, 1.000, 1.000, 1.000, 0.167] (False)
    [1.000, 1.000, 0.000, 0.042, 0.050, 0.607, 0.339] (False)
    [1.000, 1.000, 0.846, 0.209, 0.600, 0.500, 0.711] (True)
    [1.000, 1.000, 1.000, 0.524, 1.000, 1.000, 1.000] (True)
    [0.707, 1.000, 1.000, 1.000, 1.000, 1.000, 0.941] (True)
    [0.337, 1.000, 1.000, 1.000, 1.000, 0.129, 0.174] (False)
    [0.319, 1.000, 0.667, 0.310, 0.219, 0.172, 0.226] (True)
    [1.000, 0.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 0.000, 0.500, 0.615, 0.294, 0.211, 0.545] (False)
    [1.000, 1.000, 0.765, 0.692, 0.429, 0.778, 0.500] (True)
    [1.000, 1.000, 0.500, 0.257, 0.750, 0.567, 0.550] (True)
    [0.220, 1.000, 0.500, 0.500, 0.500, 1.000, 1.000] (False)
    [1.000, 1.000, 1.000, 0.979, 0.974, 0.357, 0.535] (True)
    [1.000, 0.556, 0.125, 0.182, 0.071, 0.167, 0.115] (False)
    [1.000, 1.000, 1.000, 0.120, 0.118, 0.160, 1.000] (True)
    [1.000, 0.000, 0.650, 0.357, 0.833, 0.000, 0.000] (False)
    [0.956, 1.000, 1.000, 0.071, 0.143, 0.111, 0.000] (True)
    [1.000, 1.000, 0.375, 0.933, 0.313, 1.000, 1.000] (True)
    [1.000, 0.000, 0.857, 1.000, 0.278, 0.400, 0.333] (False)
    [1.000, 0.000, 0.478, 0.857, 0.833, 0.472, 0.762] (False)
    [0.338, 1.000, 0.591, 0.765, 0.818, 1.000, 1.000] (True)
    [1.000, 1.000, 0.667, 0.250, 0.000, 0.857, 0.889] (True)
    [0.767, 1.000, 0.636, 0.769, 0.176, 0.750, 0.929] (True)
    [0.305, 1.000, 0.900, 0.889, 0.190, 0.226, 0.190] (False)
    [1.000, 1.000, 1.000, 1.000, 0.806, 0.103, 1.000] (True)
    [0.424, 1.000, 0.105, 0.077, 0.067, 0.833, 1.000] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.000] (True)
    [1.000, 0.000, 0.429, 0.286, 0.500, 0.500, 0.778] (False)
    [1.000, 1.000, 0.154, 0.211, 0.162, 0.000, 0.000] (False)
    [0.666, 1.000, 0.150, 0.200, 0.132, 0.194, 0.438] (False)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.000, 0.067] (True)
    [1.000, 1.000, 0.205, 0.708, 0.757, 0.800, 0.806] (False)
    [1.000, 1.000, 1.000, 0.933, 0.167, 0.167, 1.000] (True)

Perform oracle with 100.00% accuracy on 71 weight vectors
  The oracle will correctly classify 71 weight vectors and wrongly classify 0
  Classified 36 matches and 35 non-matches
    Purity of oracle classification:  0.507
    Entropy of oracle classification: 1.000
    Number of true matches:      36
    Number of false matches:     0
    Number of true non-matches:  35
    Number of false non-matches: 0
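
The purity and entropy values reported above follow directly from the oracle's match / non-match counts: purity is the fraction of the majority class, and entropy is the binary Shannon entropy of the match proportion. A minimal sketch (the rounding to three decimals matches the log output; the function name is ours, not from the original program):

```python
import math

def purity_and_entropy(num_matches, num_non_matches):
    """Purity: fraction of the majority class.
    Entropy: binary Shannon entropy of the match proportion, in bits."""
    total = num_matches + num_non_matches
    p = num_matches / total
    purity = max(p, 1.0 - p)
    if p in (0.0, 1.0):
        entropy = 0.0  # a pure cluster has zero entropy
    else:
        entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return purity, entropy

# 36 matches and 35 non-matches, as classified by the oracle above
purity, entropy = purity_and_entropy(36, 35)
print('%.3f %.3f' % (purity, entropy))  # 0.507 1.000
```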

Deleted 71 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

SVM classification of 207 weight vectors
  Based on 36 matches and 35 non-matches
  Classified 145 matches and 62 non-matches
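
The splitting step above trains an SVM on the oracle-labelled vectors and then predicts a label for every remaining vector in the cluster, yielding two child clusters. A sketch of that idea, using scikit-learn's `SVC` as a stand-in for whatever SVM implementation the original program uses (the training data here is made up for illustration):

```python
from sklearn.svm import SVC

# Hypothetical oracle-labelled similarity vectors (1 = match, 0 = non-match)
train_vecs   = [[1.0, 1.0, 0.9], [0.9, 1.0, 0.8],
                [0.2, 0.1, 0.0], [0.3, 0.0, 0.1]]
train_labels = [1, 1, 0, 0]

clf = SVC(kernel='linear')
clf.fit(train_vecs, train_labels)

# Classify the remaining (unlabelled) vectors to split the cluster in two
remaining = [[0.95, 0.9, 0.85], [0.1, 0.2, 0.05]]
pred = clf.predict(remaining)
match_cluster     = [v for v, p in zip(remaining, pred) if p == 1]
non_match_cluster = [v for v, p in zip(remaining, pred) if p == 0]
```

Each child cluster is then pushed back onto the queue with an estimated match proportion, as the loop summaries below show.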

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 2: Queue length: 2
  Number of manual oracle classifications performed: 71
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (145, 0.5070422535211268, 0.9998568991526107, 0.5070422535211268)
    (62, 0.5070422535211268, 0.9998568991526107, 0.5070422535211268)

Current size of match and non-match training data sets: 36 / 35

Selected cluster with (queue ordering: random):
- Purity 0.51 and entropy 1.00
- Size 145 weight vectors
- Estimated match proportion 0.507

Sample size for this cluster: 58

Farthest first selection of 58 weight vectors from 145 vectors
  The selected farthest weight vectors are:
    [1.000, 1.000, 0.476, 0.429, 0.441, 0.367, 0.237] (True)
    [1.000, 1.000, 0.650, 0.360, 0.100, 0.348, 0.500] (False)
    [0.433, 1.000, 1.000, 1.000, 0.824, 1.000, 1.000] (True)
    [1.000, 1.000, 0.778, 0.906, 0.769, 0.885, 0.864] (True)
    [0.644, 1.000, 0.885, 0.727, 0.833, 0.750, 0.769] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.833, 1.000] (True)
    [0.467, 1.000, 0.917, 0.842, 0.882, 0.171, 0.091] (False)
    [1.000, 0.556, 0.941, 0.957, 0.958, 0.938, 0.979] (False)
    [1.000, 1.000, 0.769, 0.636, 0.667, 0.667, 0.615] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.333, 0.356] (True)
    [0.900, 1.000, 1.000, 0.909, 0.600, 0.875, 0.958] (True)
    [0.867, 1.000, 1.000, 0.867, 0.875, 1.000, 0.941] (True)
    [0.644, 1.000, 0.870, 0.875, 0.800, 0.889, 0.800] (True)
    [1.000, 1.000, 0.500, 0.286, 0.750, 0.613, 0.200] (True)
    [0.807, 1.000, 0.800, 0.842, 1.000, 1.000, 0.875] (True)
    [0.967, 1.000, 0.889, 1.000, 0.857, 0.644, 0.913] (True)
    [0.186, 1.000, 1.000, 1.000, 0.333, 0.111, 0.238] (False)
    [0.380, 1.000, 1.000, 0.933, 1.000, 1.000, 0.714] (True)
    [1.000, 1.000, 0.800, 0.909, 1.000, 0.933, 1.000] (True)
    [1.000, 1.000, 0.698, 0.450, 0.703, 0.508, 0.557] (False)
    [1.000, 1.000, 1.000, 1.000, 0.882, 1.000, 1.000] (True)
    [0.808, 1.000, 0.667, 0.317, 0.516, 0.571, 0.594] (True)
    [1.000, 1.000, 1.000, 0.909, 0.706, 1.000, 1.000] (True)
    [0.975, 1.000, 0.750, 0.889, 0.333, 0.833, 0.813] (True)
    [1.000, 1.000, 1.000, 0.372, 0.833, 1.000, 0.839] (True)
    [0.855, 1.000, 1.000, 0.885, 1.000, 0.773, 1.000] (True)
    [0.917, 1.000, 1.000, 1.000, 1.000, 0.053, 1.000] (True)
    [0.495, 1.000, 1.000, 1.000, 1.000, 0.000, 0.071] (False)
    [1.000, 1.000, 1.000, 0.750, 1.000, 1.000, 1.000] (True)
    [1.000, 0.778, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.789, 0.875, 0.800, 0.867, 0.625] (True)
    [0.867, 1.000, 1.000, 1.000, 0.929, 0.487, 0.474] (True)
    [0.886, 1.000, 0.750, 0.900, 0.889, 1.000, 1.000] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 0.875] (True)
    [0.717, 1.000, 0.778, 0.867, 0.875, 1.000, 0.188] (True)
    [1.000, 1.000, 0.846, 0.652, 0.720, 0.500, 0.789] (True)
    [0.783, 1.000, 1.000, 1.000, 1.000, 1.000, 0.850] (True)
    [0.845, 1.000, 0.849, 0.851, 0.683, 0.418, 0.519] (False)
    [1.000, 1.000, 0.933, 0.867, 0.867, 0.600, 0.600] (False)
    [0.280, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [1.000, 1.000, 0.186, 0.575, 0.788, 0.407, 0.548] (False)
    [0.975, 1.000, 1.000, 1.000, 0.824, 0.786, 1.000] (True)
    [0.870, 1.000, 0.875, 0.857, 0.600, 0.645, 0.882] (True)
    [0.983, 0.556, 1.000, 0.846, 0.957, 0.833, 0.944] (False)
    [0.567, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000] (True)
    [0.825, 1.000, 0.826, 1.000, 1.000, 1.000, 1.000] (True)
    [0.876, 1.000, 0.733, 0.900, 0.900, 0.095, 0.250] (True)
    [0.750, 1.000, 0.778, 1.000, 1.000, 1.000, 0.176] (True)
    [0.812, 1.000, 0.846, 0.955, 0.909, 0.800, 1.000] (True)
    [0.967, 1.000, 1.000, 0.867, 0.875, 1.000, 0.765] (True)
    [1.000, 1.000, 0.895, 0.783, 0.903, 0.800, 0.867] (True)
    [1.000, 1.000, 0.818, 0.800, 0.714, 0.750, 1.000] (True)
    [0.550, 1.000, 0.833, 0.842, 0.923, 1.000, 0.882] (True)
    [1.000, 1.000, 1.000, 1.000, 1.000, 0.148, 0.333] (True)
    [1.000, 0.000, 1.000, 1.000, 0.950, 1.000, 1.000] (False)
    [0.917, 1.000, 0.800, 0.833, 0.769, 0.750, 0.778] (True)
    [0.867, 1.000, 0.700, 0.667, 0.824, 0.667, 0.667] (True)
    [1.000, 1.000, 0.810, 0.760, 0.800, 0.417, 0.833] (True)

Perform oracle with 100.00% accuracy on 58 weight vectors
  The oracle will correctly classify 58 weight vectors and wrongly classify 0
  Classified 47 matches and 11 non-matches
    Purity of oracle classification:  0.810
    Entropy of oracle classification: 0.701
    Number of true matches:      47
    Number of false matches:     0
    Number of true non-matches:  11
    Number of false non-matches: 0

Deleted 58 weight vectors (classified by oracle) from cluster

Cluster not pure enough or too large, and can be split further

Reached end of manual classification budget

58.0
Analysing the file: diverg(15)232_NEW.csv
<class 'pandas.core.series.Series'>
Current row here, young man!
(13,)
abordagem                    DS
iteracao                      0
inspecoesManuais              0
precision              0.985294
recall                  0.22408
f-measure              0.365123
da                           68
dm                            0
ndm                           0
tp                           67
fp                            1
tn                  4.76529e+07
fn                          232
Name: (15, 1 - acm diverg, 232), dtype: object

Load weight vector file: ../csv/conjuntosDS/conjuntosDivergAA/diverg(15)232_NEW.csv
  Weights to use: ['title', 'artist', 'track01', 'track02', 'track03', 'track10', 'track11']
  Number of weight vectors: 708
    Number of entity ID pairs that occurred more than once: 0

Analyse set of 708 weight vectors
  Containing 196 true matches and 512 true non-matches
    (27.68% true matches)
  Identified 684 unique weight vectors
  Frequency distribution of occurrences of weight vectors:
    Occurrence : Number of weight vectors that occur that often
          1 :   667  (97.51%)
          2 :    14  (2.05%)
          3 :     2  (0.29%)
          7 :     1  (0.15%)
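
The frequency distribution above counts how often each distinct weight vector occurs, then tabulates how many unique vectors share each occurrence count (percentages are relative to the number of unique vectors). A small sketch with made-up vectors (tuples, so they are hashable):

```python
from collections import Counter

# Hypothetical weight vectors, as hashable tuples
vectors = [(1.0, 0.5), (1.0, 0.5), (0.2, 0.3),
           (0.9, 0.9), (0.2, 0.3), (0.2, 0.3)]

vec_counts = Counter(vectors)               # vector -> occurrence count
freq_distr = Counter(vec_counts.values())   # occurrence count -> number of vectors
for occ, num in sorted(freq_distr.items()):
    pct = 100.0 * num / len(vec_counts)     # percent of unique vectors
    print('%d : %d  (%.2f%%)' % (occ, num, pct))
```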

Identified 0 non-pure unique weight vectors (from 684 unique weight vectors)
Pureness (as percentage of matches) for a certain unique weight vector:
  Pureness : Count
     1.000 : 174
     0.000 : 510

Removed 0 non-pure weight vectors

Final number of weight vectors to use: 708
  Number of unique weight vectors: 684

Time to load and analyse the weight vector file: 0.01 sec

Initial estimated match proportion: 0.500

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
Loop 1: Queue length: 1
  Number of manual oracle classifications performed: 0
  Size, purity, entropy, and estimated match proportion of clusters in queue:
    (684, 0.5, 1.0, 0.5)

Current size of match and non-match training data sets: 0 / 0

Selected cluster with (queue ordering: random):
- Purity 0.50 and entropy 1.00
- Size 684 weight vectors
- Estimated match proportion 0.500

Sample size for this cluster: 84

Perform initial selection using "far" method

Farthest first selection of 84 weight vectors from 684 vectors